[ 
http://issues.ops4j.org/browse/PAXEXAM-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13344#action_13344
 ] 

Bartosz Kowalewski commented on PAXEXAM-174:
--------------------------------------------

I also took a look at the Pax Exam sources and found the mechanism that checks 
whether a port is free. I got the impression that this mechanism is unsafe. 
Pax Exam creates a server socket to verify that a port is free and then closes 
it. There is a short time window between closing that socket and using the 
port to create the RMI registry, and during this window another application 
might bind the port. I can imagine that many applications on the build machine 
try to use port 1099. I'm not sure whether this is the issue here, but it 
could be.
I guess the RMI registry could be created in the process that launches the 
test, and the activator could then just locate it. This way we could get rid 
of the unsafe mechanism.
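
To illustrate the idea, here is a minimal sketch (not Pax Exam's actual code) 
of binding without a separate "is the port free?" probe: the launching process 
calls LocateRegistry.createRegistry directly and treats a failure as "port 
taken", so no window is left between checking and binding. The class name 
RegistryBinder, the Bound holder and the port range are hypothetical names 
used only for this example.

```java
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;

public class RegistryBinder {

    /** Holds a created registry together with the port it is bound to. */
    public static final class Bound {
        public final Registry registry;
        public final int port;

        Bound(Registry registry, int port) {
            this.registry = registry;
            this.port = port;
        }
    }

    /**
     * Tries each port in [from, to] and returns the first registry that could
     * be created. Creating the registry IS the availability check, so there
     * is no gap between "check" and "bind" for another process to slip into.
     */
    public static Bound createOnFirstFreePort(int from, int to) throws RemoteException {
        RemoteException last = null;
        for (int port = from; port <= to; port++) {
            try {
                return new Bound(LocateRegistry.createRegistry(port), port);
            } catch (RemoteException e) {
                // Typically a java.rmi.server.ExportException when the port is in use.
                last = e;
            }
        }
        throw new RemoteException("No free port in range " + from + "-" + to, last);
    }
}
```

The launcher would then have to pass the chosen port to the OSGi container 
(for example as a system property) so that the activator can locate the 
registry; how that hand-over would look in Pax Exam is left open here.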


> non-JRMP server at remote endpoint
> ----------------------------------
>
>                 Key: PAXEXAM-174
>                 URL: http://issues.ops4j.org/browse/PAXEXAM-174
>             Project: Pax Exam
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>         Environment: Linux, Java 1.6_u16
>            Reporter: Bartosz Kowalewski
>            Assignee: Toni Menzel
>
> We observe the following problem when running Pax Exam tests on our build 
> machine (which runs Linux). It doesn't happen every time. This problem 
> doesn't occur on our Windows laptops.
> {code}
> org.ops4j.pax.exam.spi.container.TestContainerException: Cannot get the remote bundle context
>       at org.ops4j.pax.exam.rbc.client.RemoteBundleContextClient.getRemoteBundleContext(RemoteBundleContextClient.java:324)
>       at org.ops4j.pax.exam.rbc.client.RemoteBundleContextClient.waitForState(RemoteBundleContextClient.java:265)
>       at org.ops4j.pax.exam.container.def.internal.PaxRunnerTestContainer.waitForState(PaxRunnerTestContainer.java:317)
>       at org.ops4j.pax.exam.container.def.internal.PaxRunnerTestContainer.start(PaxRunnerTestContainer.java:272)
>       at org.ops4j.pax.exam.junit.internal.JUnit4TestMethod.invoke(JUnit4TestMethod.java:142)
>       at org.junit.internal.runners.MethodRoadie.runTestMethod(MethodRoadie.java:105)
>       at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:86)
>       at org.ops4j.pax.exam.junit.internal.JUnit4MethodRoadie.runBeforesThenTestThenAfters(JUnit4MethodRoadie.java:60)
>       at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:84)
>       at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49)
>       at org.ops4j.pax.exam.junit.JUnit4TestRunner.invokeTestMethod(JUnit4TestRunner.java:246)
>       at org.ops4j.pax.exam.junit.JUnit4TestRunner.runMethods(JUnit4TestRunner.java:196)
>       at org.ops4j.pax.exam.junit.JUnit4TestRunner$2.run(JUnit4TestRunner.java:186)
>       at org.junit.internal.runners.ClassRoadie.runUnprotected(ClassRoadie.java:34)
>       at org.junit.internal.runners.ClassRoadie.runProtected(ClassRoadie.java:44)
>       at org.ops4j.pax.exam.junit.JUnit4TestRunner.run(JUnit4TestRunner.java:182)
>       at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
>       at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
>       at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:165)
>       at org.apache.maven.surefire.Surefire.run(Surefire.java:107)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:289)
>       at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1005)
> Caused by: java.rmi.ConnectIOException: non-JRMP server at remote endpoint
>       at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:230)
>       at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184)
>       at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:322)
>       at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source)
>       at org.ops4j.pax.exam.rbc.client.RemoteBundleContextClient.getRemoteBundleContext(RemoteBundleContextClient.java:302)
>       ... 25 more
> {code}
> This might be a timing issue - something related to releasing OS /socket/ 
> resources, etc. We thought that several tests (test methods) run in a row 
> might cause this problem. The second Pax Exam test could be influenced by the 
> first one. This was just a hypothesis. We then reproduced this issue in the 
> first of the series of tests (test methods), so it could not be caused by 
> resources left by other tests.
> This issue is observed on a CI server. We also tried to verify if other 
> builds that are run on this machine do not influence Pax Exam operation. We 
> added some customizations to Pax Exam and changed the port number used for 
> RMI communication between the OSGi container and the process that launches 
> the test. Instead of 1099, a different port was used. We still observed this 
> issue.
> I finally applied a _brute force_ workaround, but I'm not sure why Pax Exam 
> behaves differently after this change was applied.
> I hooked a custom org.ops4j.pax.exam.spi.container.TestContainerFactory 
> implementation using Pax Exam's SPI mechanisms. I'm using a slightly modified 
> version of org.ops4j.pax.exam.container.def.internal.PaxRunnerTestContainer. 
> After reading the Pax Exam source code and analyzing our log files, I came 
> to the conclusion that there might be some issue with synchronization and 
> timing inside Pax Exam. The test container waits for a change in the state of 
> the system bundle. This mechanism works over RMI - the container connects to 
> an RMI registry that is started by the Pax Exam activator. It seems that 
> sometimes the server side of the RMI connection is not yet ready to handle 
> any requests when the waiting starts. I added a sleep to 
> PaxRunnerTestContainer: 
> {code}
>         long startedAt = System.currentTimeMillis();
>         URLUtils.resetURLStreamHandlerFactory();
>         Run.start( m_javaRunner, m_arguments.getArguments() );
>         LOG.info( "Test container (Pax Runner " + Info.getPaxRunnerVersion() + ") started in "
>             + ( System.currentTimeMillis() - startedAt ) + " millis" );
>         // XXX yah :-/
>         try
>         {
>             Thread.sleep(15000);
>         }
>         catch (InterruptedException e1)
>         {
>             // ignore
>         }
>
>         LOG.info( "Wait for test container to finish its initialization "
>             + ( m_startTimeout == WAIT_FOREVER ? "without timing out" : "for " + m_startTimeout + " millis" ) );
>         try
>         {
>             waitForState( SYSTEM_BUNDLE, Bundle.ACTIVE, m_startTimeout );
>         }
>         catch ( TimeoutException e )
>         {
>             throw new TimeoutException( "Test container did not initialize in the expected time of " + m_startTimeout
>                 + " millis" );
>         }
> {code} 
> And I haven't reproduced the issue even though I ran the build 5 times. Of 
> course:
> - it's a brute force workaround that attempts to show where the problem might 
> be
> - I'm not _100%_ sure that this timing issue was the source of our problems
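
As a possible alternative to the fixed 15-second sleep described above, the 
client could poll the registry until the lookup succeeds or a deadline passes. 
The following is only a sketch under that assumption, not what Pax Exam does; 
RegistryPoller, lookupWithRetry and all parameter names are hypothetical.

```java
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;

public class RegistryPoller {

    /**
     * Repeats the registry lookup until it succeeds or timeoutMillis has
     * elapsed. Failures caused by a server side that is not ready yet
     * (connection refused, name not bound) are retried after pauseMillis.
     */
    public static Remote lookupWithRetry(String host, int port, String name,
                                         long timeoutMillis, long pauseMillis)
            throws RemoteException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        Exception last;
        while (true) {
            try {
                return LocateRegistry.getRegistry(host, port).lookup(name);
            } catch (RemoteException e) {
                last = e; // registry not reachable (yet)
            } catch (NotBoundException e) {
                last = e; // registry is up, but the name is not bound (yet)
            }
            if (System.currentTimeMillis() >= deadline) {
                throw new RemoteException("'" + name + "' not available on " + host + ":" + port
                    + " within " + timeoutMillis + " millis", last);
            }
            Thread.sleep(pauseMillis);
        }
    }
}
```

Compared with a fixed sleep, this returns as soon as the server side is ready 
and still fails with the last underlying exception if it never becomes ready. 
It would not help if a different, non-RMI application is listening on the 
port.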

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.ops4j.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
