How are the ports dynamically allocated? If five MACs are started at the same time, could they be stepping on each other?
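
To make the concern concrete, here is a minimal sketch of one common way test harnesses pick a "free" port: bind to port 0, read the port the OS assigned, and release it. I have not checked whether MAC does this, so treat it as an assumption; the class and method names (FreePortProbe, findFreePort, isBindable) are made up for illustration. The point is that the port is only known to be free at the moment of the probe, so another process started at the same time could bind it before the intended server does.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper, not part of Accumulo or MAC.
public class FreePortProbe {

    /** Asks the OS for an unused ephemeral port by binding to port 0, then releases it. */
    public static int findFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
        // The port is released here; nothing reserves it for the caller.
    }

    /** Returns true if a listening socket can be bound to the port right now. */
    public static boolean isBindable(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        int port = findFreePort();
        System.out.println("probed port: " + port);

        // Simulate a second process grabbing the port between the probe and its use.
        try (ServerSocket other = new ServerSocket(port)) {
            System.out.println("bindable while another socket holds it? " + isBindable(port));
        }
    }
}
```

If something like this is happening, tests that each probe for ports in parallel could occasionally hand the same port to two clusters, which would be consistent with the "Connection refused" loop described below.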

On Fri, Jun 26, 2015 at 11:36 AM, Josh Elser <[email protected]> wrote:

> Makes sense. Thanks for clarifying!
>
> Definitely send us a JIRA issue so we can track the issue and try to get
> to the bottom of it.
>
>
> Parise, Jonathan wrote:
>
>> Yes, each test has a unique directory for the MAC.
>>
>> I use the following code to ensure that:
>>
>> storageDirectory_ = new File("atf-temp" + Path.SEPARATOR +
>>     "accumulo-test-"
>>     + UUID.randomUUID().toString());
>>
>> Basically this makes a new random UUID so that I get unique paths.
>> All of the directories end up in atf-temp. This way I can clean them up
>> easily.
>>
>> Jon
>>
>> -----Original Message-----
>> From: Josh Elser [mailto:[email protected]]
>> Sent: Friday, June 26, 2015 10:50 AM
>> To: [email protected]
>> Subject: Re: MiniAccumuloClutser Unit Test Problems
>>
>> Did each test class use a unique directory for MAC?
>>
>> Assuming they're all unique, it sounds like a bug. I know some devs run
>> our suite of integration tests in parallel which use MAC out of the box.
>> If all of the above is true, please feel free to open up an issue on JIRA
>> and we can look into it more there.
>>
>> https://issues.apache.org/jira/secure/CreateIssue!default.jspa
>>
>> Thanks!
>>
>> Parise, Jonathan wrote:
>>
>>> Josh,
>>>
>>> Sorry for the confusion. I'll try to explain again with better
>>> terminology.
>>>
>>> I have several test classes, each of which contains several test methods.
>>> For each test class I have an @BeforeClass that configures and starts a
>>> MAC. I also have an @AfterClass that calls MAC.stop().
>>>
>>> So the flow is: MAC is created, 1-n tests in the class run against it,
>>> MAC is stopped, repeat for the next N tests.
>>>
>>> This way several @Test methods in the same class use the same MAC. I
>>> mostly did this to make the tests run faster. I understand that having
>>> several test methods share a MAC could cause test state pollution, but I am
>>> careful to avoid that.
>>>
>>> The issue I was seeing is that if I simply ran "mvn test", the first
>>> few tests would pass and then eventually one of the tests would get stuck
>>> forever and just keep throwing connection errors.
>>>
>>> When I changed the Maven configuration to force the tests to run
>>> serially, the issue stopped occurring.
>>>
>>> I'm not sure if these tests are expected to work when run in parallel.
>>> My best guess is that the tests may conflict with each other over ports or
>>> something like that.
>>>
>>> I'd like to understand why changing the test running behavior fixed this
>>> issue. Also, I think it would be good to document this somewhere. The MAC
>>> javadoc and also the user's guide should provide details about using the
>>> MAC for tests and properly configuring those tests.
>>>
>>> Does that make more sense?
>>>
>>> Jon
>>>
>>> -----Original Message-----
>>> From: Josh Elser [mailto:[email protected]]
>>> Sent: Friday, June 26, 2015 10:00 AM
>>> To: [email protected]
>>> Subject: Re: MiniAccumuloClutser Unit Test Problems
>>>
>>> Jonathan,
>>>
>>> If you're not seeing consistent behavior starting and stopping a
>>> MiniAccumuloCluster repeatedly, that's a bug. If you can provide code
>>> that shows this problem, that'd be a huge help.
>>>
>>> If you can get a list of the processes running when you see this happen
>>> and cross-reference it with what processes should be running, that would
>>> also go a long way in trying to debug this.
>>>
>>> I am a little confused as to your specific situation. You said that you
>>> construct and start a MAC instance in a BeforeClass and stop it in an
>>> AfterClass, but then you said that you start and stop it for each test.
>>> Are you saying that after the third construction and use of a MAC, you
>>> see problems? Or, are you saying that you stop and start each MAC instance
>>> before you run the @Test methods?
>>>
>>> - Josh
>>>
>>> Parise, Jonathan wrote:
>>>
>>>> Hello,
>>>>
>>>> I have been writing some JUnit tests based on the
>>>> MiniAccumuloCluster class. I'm experiencing some issues when several
>>>> of the tests run back to back. Before I get into the error, let me
>>>> explain how the tests work in general. Also, I am using Accumulo 1.6.2.
>>>>
>>>> Each test has an @BeforeClass method that first creates a new random
>>>> directory. Then it makes a new MiniAccumuloCluster instance using that
>>>> directory as the dir parameter. Then, I call
>>>> MiniAccumuloCluster.start().
>>>>
>>>> There are several @Test methods in each test class. The typical
>>>> pattern for them is that they create any necessary tables, write some
>>>> data into those tables and then scan to verify it was written
>>>> correctly. Basically they are testing that I can serialize and
>>>> deserialize various types of Objects correctly.
>>>>
>>>> Then the test class has an @AfterClass method that calls
>>>> MiniAccumuloCluster.stop(). It also deletes the random directory used
>>>> by the previous test.
>>>>
>>>> The issue I am running into is that generally the first test or two
>>>> run fine. However, the third test usually gets stuck in the
>>>> MiniAccumuloCluster startup. It just keeps complaining about being
>>>> unable to connect. Note that if the test is run independently it
>>>> passes just fine.
>>>> When run back to back, I see errors like this one repeatedly:
>>>>
>>>> 2015-06-26 09:10:24,352 INFO [main-SendThread(localhost:47046)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening
>>>> socket connection to server localhost/127.0.0.1:47046
>>>>
>>>> 2015-06-26 09:10:24,353 WARN [main-SendThread(localhost:47046)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session
>>>> 0x14e2ffc86d80004 for server null, unexpected error, closing socket
>>>> connection and attempting reconnect
>>>> java.net.ConnectException: Connection refused
>>>>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1143)
>>>>
>>>> 2015-06-26 09:10:24,956 INFO [main-SendThread(localhost:10406)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening
>>>> socket connection to server localhost/127.0.0.1:10406
>>>>
>>>> 2015-06-26 09:10:24,957 WARN [main-SendThread(localhost:10406)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session
>>>> 0x14e2ffcb2390004 for server null, unexpected error, closing socket
>>>> connection and attempting reconnect
>>>>
>>>> 2015-06-26 09:10:51,764 INFO [main-SendThread(localhost:10406)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening
>>>> socket connection to server localhost/127.0.0.1:10406
>>>>
>>>> 2015-06-26 09:10:51,765 WARN [main-SendThread(localhost:10406)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session
>>>> 0x14e2ffcb2390004 for server null, unexpected error, closing socket
>>>> connection and attempting reconnect
>>>> java.net.ConnectException: Connection refused
>>>>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1143)
>>>>
>>>> 2015-06-26 09:10:52,572 INFO [main-SendThread(localhost:47046)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening
>>>> socket connection to server localhost/127.0.0.1:47046
>>>>
>>>> 2015-06-26 09:10:52,573 WARN [main-SendThread(localhost:47046)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session
>>>> 0x14e2ffc86d80004 for server null, unexpected error, closing socket
>>>> connection and attempting reconnect
>>>> java.net.ConnectException: Connection refused
>>>>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1143)
>>>>
>>>> 2015-06-26 09:10:52,890 INFO [main-SendThread(localhost:10406)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening
>>>> socket connection to server localhost/127.0.0.1:10406
>>>>
>>>> 2015-06-26 09:10:52,891 WARN [main-SendThread(localhost:10406)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session
>>>> 0x14e2ffcb2390004 for server null, unexpected error, closing socket
>>>> connection and attempting reconnect
>>>> java.net.ConnectException: Connection refused
>>>>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1143)
>>>>
>>>> 2015-06-26 09:10:54,191 INFO [main-SendThread(localhost:10406)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening
>>>> socket connection to server localhost/127.0.0.1:10406
>>>>
>>>> 2015-06-26 09:10:54,192 WARN [main-SendThread(localhost:10406)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session
>>>> 0x14e2ffcb2390004 for server null, unexpected error, closing socket
>>>> connection and attempting reconnect
>>>> java.net.ConnectException: Connection refused
>>>>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1143)
>>>>
>>>> 2015-06-26 09:10:54,471 INFO [main-SendThread(localhost:47046)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:startConnect(1058)) - Opening
>>>> socket connection to server localhost/127.0.0.1:47046
>>>>
>>>> 2015-06-26 09:10:54,471 WARN [main-SendThread(localhost:47046)]
>>>> zookeeper.ClientCnxn (ClientCnxn.java:run(1185)) - Session
>>>> 0x14e2ffc86d80004 for server null, unexpected error, closing socket
>>>> connection and attempting reconnect
>>>>
>>>> In the ZooKeeperServerMain.out I see lines like this repeating
>>>> several thousand times:
>>>>
>>>> 2015-06-26 08:39:44,810 INFO [SyncThread:0] server.NIOServerCnxn
>>>> (NIOServerCnxn.java:finishSessionInit(1580)) - Established session
>>>> 0x14e2fe179eb0000 with negotiated timeout 30000 for client
>>>> /127.0.0.1:36311
>>>>
>>>> 2015-06-26 08:39:45,278 INFO [ProcessThread:-1]
>>>> server.PrepRequestProcessor (PrepRequestProcessor.java:pRequest(419))
>>>> - Got user-level KeeperException when processing
>>>> sessionid:0x14e2fe179eb0000 type:create cxid:0x31
>>>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error
>>>> Path:/accumulo/87e85c1f-eb40-4695-b11d-67ed88586648/tables/+r/conf
>>>> Error:KeeperErrorCode = NodeExists for
>>>> /accumulo/87e85c1f-eb40-4695-b11d-67ed88586648/tables/+r/conf
>>>>
>>>> 2015-06-26 08:39:45,300 INFO [ProcessThread:-1]
>>>> server.PrepRequestProcessor (PrepRequestProcessor.java:pRequest(419))
>>>> - Got user-level KeeperException when processing
>>>> sessionid:0x14e2fe179eb0000 type:create cxid:0x33
>>>> zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error
>>>> Path:/accumulo/87e85c1f-eb40-4695-b11d-67ed88586648/tables/!0/conf
>>>> Error:KeeperErrorCode = NodeExists for
>>>> /accumulo/87e85c1f-eb40-4695-b11d-67ed88586648/tables/!0/conf
>>>>
>>>> I've thought about putting a Thread.sleep() call after the
>>>> MiniAccumuloCluster.stop() call, but that certainly seems brittle.
>>>> I'm not sure if that would improve the situation.
>>>>
>>>> It seems to me that the MiniAccumuloCluster does not behave well when
>>>> instances are started and stopped several times. I am running the
>>>> tests through Maven using the default test behavior.
>>>>
>>>> Could it be something with Maven? Maybe I need to be more explicit
>>>> when telling it how to run the tests?
>>>>
>>>> Does anyone have any insight into what is going wrong here? In general,
>>>> is my usage pattern correct for MiniAccumuloCluster?
>>>>
>>>> Thanks,
>>>>
>>>> Jon Parise
>>>>
>>>> Senior Software Engineer
>>>>
>>>> Viz | General Dynamics Mission Systems
>>>>
>>>> *This message and/or attachments may include information subject to
>>>> GD Corporate Policies 07-103 and 07-105 and is intended to be
>>>> accessed only by authorized recipients. Use, storage and transmission
>>>> are governed by General Dynamics and its policies. Contractual
>>>> restrictions apply to third parties. Recipients should refer to the
>>>> policies or contract to determine proper handling. Unauthorized
>>>> review, use, disclosure or distribution is prohibited. If you are not
>>>> an intended recipient, please contact the sender and destroy all
>>>> copies of the original message.*
