Re: unit test failure
Hi Sergey,

I had a feeling that I was missing something important. Thanks.

On running Ant, I tripped over the firewall trying to download Ivy. Ant needed to know about our internet proxy:

export ANT_OPTS="-Dhttp.proxyHost=proxy -Dhttp.proxyPort=8080"

The build reported a few errors, but seems to have completed:

vm-026-lenny-mw$ ant
Buildfile: build.xml
init:
ivy-download:
    [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
    [get] To: /home/martin/zookeeper-3.3.1/src/java/lib/ivy-2.1.0.jar
ivy-taskdef:
ivy-init:
ivy-retrieve:
    [ivy:retrieve] :: Ivy 2.1.0 - 20090925235825 :: http://ant.apache.org/ivy/ ::
    [ivy:retrieve] :: loading settings :: file = /home/martin/zookeeper-3.3.1/ivysettings.xml
    [ivy:retrieve] :: resolving dependencies :: org.apache.zookeeper#zookeeper;3.3.2-dev
    [ivy:retrieve] confs: [default]
    [ivy:retrieve] found log4j#log4j;1.2.15 in maven2
    [ivy:retrieve] found jline#jline;0.9.94 in maven2
    [ivy:retrieve] downloading http://repo1.maven.org/maven2/log4j/log4j/1.2.15/log4j-1.2.15.jar ...
    [ivy:retrieve]
    [ivy:retrieve]
    [ivy:retrieve]
    [ivy:retrieve] ..
    [ivy:retrieve] ..
    [ivy:retrieve] .
    [ivy:retrieve] .
    [ivy:retrieve] .
    [ivy:retrieve] . (382kB)
    [ivy:retrieve] .. (0kB)
    [ivy:retrieve] [SUCCESSFUL ] log4j#log4j;1.2.15!log4j.jar (15533ms)
    [ivy:retrieve] downloading http://repo1.maven.org/maven2/jline/jline/0.9.94/jline-0.9.94.jar ...
    [ivy:retrieve] ...
    [ivy:retrieve] .
    [ivy:retrieve] . (85kB)
    [ivy:retrieve] .. (0kB)
    [ivy:retrieve] [SUCCESSFUL ] jline#jline;0.9.94!jline.jar (8232ms)
    [ivy:retrieve] :: resolution report :: resolve 11530ms :: artifacts dl 23769ms
    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |   2   |   2   |   2   |   0   ||   2   |   2   |
    ---------------------------------------------------------------------
    [ivy:retrieve] :: retrieving :: org.apache.zookeeper#zookeeper
    [ivy:retrieve] confs: [default]
    [ivy:retrieve] 2 artifacts copied, 0 already retrieved (467kB/24ms)
clover.setup:
clover.info:
clover:
jute:
compile_jute_uptodate:
compile_jute:
ver-gen:
    [javac] Compiling 1 source file to /home/martin/zookeeper-3.3.1/build/classes
svn-revision:
    [mkdir] Created dir: /home/martin/zookeeper-3.3.1/.revision
    [exec] /home/martin/zookeeper-3.3.1/src/lastRevision.sh: line 19: svn: command not found
version-info:
    [java] All version-related parameters must be valid integers!
    [java] Exception in thread main java.lang.NumberFormatException: For input string: 2-dev
    [java] at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    [java] at java.lang.Integer.parseInt(Integer.java:458)
    [java] at java.lang.Integer.parseInt(Integer.java:499)
    [java] at org.apache.zookeeper.version.util.VerGen.main(VerGen.java:131)
    [java] Java Result: 1
build-generated:
    [javac] Compiling 46 source files to /home/martin/zookeeper-3.3.1/build/classes
compile:
    [javac] Compiling 113 source files to /home/martin/zookeeper-3.3.1/build/classes
jar:
    [jar] Building jar: /home/martin/zookeeper-3.3.1/build/zookeeper-3.3.2-dev.jar

BUILD SUCCESSFUL
Total time: 59 seconds

What I do not understand now is why the process has built 3.3.2-dev within the 3.3.1 source bundle. Did the build download a new version?

regards,
Martin

On 3 August 2010 18:18, Sergey Doroshenko dors...@gmail.com wrote:

Yes. It seems you didn't build the server actually. In the main directory (from where you ran ant compile_jute) run ant. This will invoke ant's default target which will build the server code. After that run either make run-check from src/c, or ant test-core-cppunit from the main dir.

On Tue, Aug 3, 2010 at 7:51 PM, Martin Waite waite@gmail.com wrote:

Hi Sergey,

Thanks for the hints.

1) I cannot find any ZK server logs. I have no tmp directory inside build directory. This sounds bad.
2) I have run ps -ef | grep zoo. There are no zk processes running.

I wonder if I have missed some steps. So far I have:

1. built a new debian lenny host
2. sudo apt-get install autoconf libtool libcppunit-dev sun-java6-jdk ant
3. tar zxf zookeeper-3.3.1.tar.gz
4. cd zookeeper-3.3.1
5. ant compile_jute
6. cd src/c
7. autoreconf -if
8. ./configure
9. make run-check

Do I have to do something before building the client code to install the ZK server
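
The NumberFormatException reported by the version-info step can be reproduced in isolation. The sketch below is not the actual VerGen source: the class and method names are hypothetical, and it only assumes, from the stack trace, that a part of the version string such as 2-dev reaches Integer.parseInt unchanged. It shows why that part is rejected, plus one possible lenient alternative that keeps only the leading digits.

```java
// Hypothetical illustration; not the real org.apache.zookeeper.version.util.VerGen code.
class VersionPartDemo {

    /** Strict parse: Integer.parseInt rejects any non-digit suffix such as "-dev". */
    static int strictPart(String part) {
        return Integer.parseInt(part);
    }

    /** Lenient parse (an assumption, not ZooKeeper behaviour): keep only the leading digits. */
    static int lenientPart(String part) {
        int end = 0;
        while (end < part.length() && Character.isDigit(part.charAt(end))) {
            end++;
        }
        if (end == 0) {
            throw new NumberFormatException("no leading digits in: " + part);
        }
        return Integer.parseInt(part.substring(0, end));
    }

    public static void main(String[] args) {
        // Version string taken from the build log above.
        String[] parts = "3.3.2-dev".split("\\.");
        for (String part : parts) {
            try {
                System.out.println(part + " parses strictly as " + strictPart(part));
            } catch (NumberFormatException e) {
                System.out.println(part + " fails strict parsing; lenient value is " + lenientPart(part));
            }
        }
    }
}
```

Note that in the log the build carries on after this step (Java Result: 1 is reported, then BUILD SUCCESSFUL), so the failure appears to affect only the generated version information.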
Re: unit test failure
Hi Mahadev,

Sorry for the delay in replying: I have been away.

I have rebuilt my debian lenny machine, and started again. Again, I have the same problem:

~/zookeeper-3.3.1/src/c$ make run-check
make zktest-st zktest-mt
make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
make[1]: `zktest-st' is up to date.
make[1]: `zktest-mt' is up to date.
make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
./zktest-st
./tests/zkServer.sh: line 52: kill: (9432) - No such process
ZooKeeper server startedRunning
Zookeeper_operations::testPing : elapsed 1 : OK
Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 1 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 : OK
Zookeeper_operations::testConcurrentOperations1 : elapsed 203 : OK
Zookeeper_init::testBasic : elapsed 0 : OK
Zookeeper_init::testAddressResolution : elapsed 0 : OK
Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK
Zookeeper_init::testNullAddressString : elapsed 0 : OK
Zookeeper_init::testEmptyAddressString : elapsed 0 : OK
Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK
Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString2 : elapsed 1 : OK
Zookeeper_init::testNonexistentHost : elapsed 1 : OK
Zookeeper_init::testOutOfMemory_init : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs1 : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs2 : elapsed 0 : OK
Zookeeper_init::testPermuteAddrsList : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected1 : elapsed 0 : OK
Zookeeper_close::testCloseConnected1 : elapsed 0 : OK
Zookeeper_close::testCloseFromWatcher1 : elapsed 0 : OK
Zookeeper_simpleSystem::testAsyncWatcherAutoResetterminate called after throwing an instance of 'CppUnit::Exception'
what(): equality assertion failed
- Expected: -101
- Actual : -4
make: *** [run-check] Aborted

I have checked for a stale zk process: there are none.

Any ideas please ?

regards,
Martin

On 14 July 2010 18:37, Mahadev Konar maha...@yahoo-inc.com wrote:

HI Martin,
Can you check if you have a stale java process (ZooKeeperServer) running on your machine? That might cause some issues with the tests.
Thanks
mahadev

On 7/14/10 8:03 AM, Martin Waite waite@gmail.com wrote:

Hi,

I am attempting to build the C client on debian lenny. autoconf, configure, make and make install all appear to work cleanly. I ran:

autoreconf -if
./configure
make
make install
make run-check

However, the unit tests fail:

$ make run-check
make zktest-st zktest-mt
make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
make[1]: `zktest-st' is up to date.
make[1]: `zktest-mt' is up to date.
make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
./zktest-st
./tests/zkServer.sh: line 52: kill: (17711) - No such process
ZooKeeper server startedRunning
Zookeeper_operations::testPing : elapsed 1 : OK
Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 2 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 : OK
Zookeeper_operations::testConcurrentOperations1 : elapsed 206 : OK
Zookeeper_init::testBasic : elapsed 0 : OK
Zookeeper_init::testAddressResolution : elapsed 0 : OK
Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK
Zookeeper_init::testNullAddressString : elapsed 0 : OK
Zookeeper_init::testEmptyAddressString : elapsed 0 : OK
Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK
Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString2 : elapsed 2 : OK
Zookeeper_init::testNonexistentHost : elapsed 108 : OK
Zookeeper_init::testOutOfMemory_init : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs1 : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs2 : elapsed 0 : OK
Zookeeper_init::testPermuteAddrsList : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected1 : elapsed 0 : OK
Zookeeper_close::testCloseConnected1 : elapsed 0 : OK
Zookeeper_close::testCloseFromWatcher1 : elapsed 0 : OK
Zookeeper_simpleSystem::testAsyncWatcherAutoResetterminate called after throwing an instance of 'CppUnit::Exception'
what(): equality assertion failed
- Expected: -101
- Actual : -4
make: *** [run-check] Aborted

This appears to come from tests/TestClient.cc - but beyond that, it is hard to identify which
Re: unit test failure
Hi,

A little more information.

The file TEST-Zookeeper_simpleSystem-st.txt contains some log data:

2010-08-03 13:00:13,550:11391:zoo_i...@zookeeper_init@727: Initiating client connection, host=127.0.0.1:22181 sessionTimeout=1 watcher=0x8078ed8 sessionId=0 sessionPasswd=null context=0xbfffd73c flags=0
2010-08-03 13:00:14,553:11391:zoo_er...@handle_socket_error_msg@1579: Socket [127.0.0.1:22181] zk retcode=-4, errno=111(Connection refused): server refused to accept the client

I don't really understand how the unit test system is meant to work. Does it start a zk server process and then run tests against that ? Or are the underlying libraries being tested without using a zk server ?

regards,
Martin

On 3 August 2010 13:02, Martin Waite waite@gmail.com wrote:

Hi Mahadev,

Sorry for the delay in replying: I have been away.

I have rebuilt my debian lenny machine, and started again. Again, I have the same problem:

~/zookeeper-3.3.1/src/c$ make run-check
make zktest-st zktest-mt
make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
make[1]: `zktest-st' is up to date.
make[1]: `zktest-mt' is up to date.
make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
./zktest-st
./tests/zkServer.sh: line 52: kill: (9432) - No such process
ZooKeeper server startedRunning
Zookeeper_operations::testPing : elapsed 1 : OK
Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 1 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 : OK
Zookeeper_operations::testConcurrentOperations1 : elapsed 203 : OK
Zookeeper_init::testBasic : elapsed 0 : OK
Zookeeper_init::testAddressResolution : elapsed 0 : OK
Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK
Zookeeper_init::testNullAddressString : elapsed 0 : OK
Zookeeper_init::testEmptyAddressString : elapsed 0 : OK
Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK
Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString2 : elapsed 1 : OK
Zookeeper_init::testNonexistentHost : elapsed 1 : OK
Zookeeper_init::testOutOfMemory_init : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs1 : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs2 : elapsed 0 : OK
Zookeeper_init::testPermuteAddrsList : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected1 : elapsed 0 : OK
Zookeeper_close::testCloseConnected1 : elapsed 0 : OK
Zookeeper_close::testCloseFromWatcher1 : elapsed 0 : OK
Zookeeper_simpleSystem::testAsyncWatcherAutoResetterminate called after throwing an instance of 'CppUnit::Exception'
what(): equality assertion failed
- Expected: -101
- Actual : -4
make: *** [run-check] Aborted

I have checked for a stale zk process: there are none.

Any ideas please ?

regards,
Martin

On 14 July 2010 18:37, Mahadev Konar maha...@yahoo-inc.com wrote:

HI Martin,
Can you check if you have a stale java process (ZooKeeperServer) running on your machine? That might cause some issues with the tests.
Thanks
mahadev

On 7/14/10 8:03 AM, Martin Waite waite@gmail.com wrote:

Hi,

I am attempting to build the C client on debian lenny. autoconf, configure, make and make install all appear to work cleanly. I ran:

autoreconf -if
./configure
make
make install
make run-check

However, the unit tests fail:

$ make run-check
make zktest-st zktest-mt
make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
make[1]: `zktest-st' is up to date.
make[1]: `zktest-mt' is up to date.
make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
./zktest-st
./tests/zkServer.sh: line 52: kill: (17711) - No such process
ZooKeeper server startedRunning
Zookeeper_operations::testPing : elapsed 1 : OK
Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 2 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 : OK
Zookeeper_operations::testConcurrentOperations1 : elapsed 206 : OK
Zookeeper_init::testBasic : elapsed 0 : OK
Zookeeper_init::testAddressResolution : elapsed 0 : OK
Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK
Zookeeper_init::testNullAddressString : elapsed 0 : OK
Zookeeper_init::testEmptyAddressString : elapsed 0 : OK
Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK
Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString2 : elapsed 2 : OK
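
The "Connection refused" entry in the client log above means nothing was listening on 127.0.0.1:22181 when the test tried to connect. A minimal sketch of checking that independently of the test suite follows. The class and method names are hypothetical; the host and port are the ones shown in the log, and the one-second timeout is an arbitrary choice.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper; not part of the ZooKeeper test harness.
class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isAccepting(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused" as well as timeouts and unreachable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Address taken from the TEST-Zookeeper_simpleSystem-st.txt log.
        boolean up = isAccepting("127.0.0.1", 22181, 1000);
        System.out.println("127.0.0.1:22181 accepting connections: " + up);
    }
}
```

Running this while the C tests are executing would show whether the test server ever comes up on that port.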
Re: unit test failure
I don't really understand how the unit test system is meant to work.

Yes, the C tests start a ZK server and run against it. There are a few that use a mocked server, but that's legacy code I think.

Re the problem itself:

1) check the ZK server logs at build/tmp/ , maybe the log contains some info about why the server fails to start
2) are you absolutely sure you have no zk processes running? I usually do ps -ef | grep zoo and it always helps

On Tue, Aug 3, 2010 at 6:45 PM, Martin Waite waite@gmail.com wrote:

Hi,

A little more information.

The file TEST-Zookeeper_simpleSystem-st.txt contains some log data:

2010-08-03 13:00:13,550:11391:zoo_i...@zookeeper_init@727: Initiating client connection, host=127.0.0.1:22181 sessionTimeout=1 watcher=0x8078ed8 sessionId=0 sessionPasswd=null context=0xbfffd73c flags=0
2010-08-03 13:00:14,553:11391:zoo_er...@handle_socket_error_msg@1579: Socket [127.0.0.1:22181] zk retcode=-4, errno=111(Connection refused): server refused to accept the client

I don't really understand how the unit test system is meant to work. Does it start a zk server process and then run tests against that ? Or are the underlying libraries being tested without using a zk server ?

regards,
Martin

On 3 August 2010 13:02, Martin Waite waite@gmail.com wrote:

Hi Mahadev,

Sorry for the delay in replying: I have been away.

I have rebuilt my debian lenny machine, and started again. Again, I have the same problem:

~/zookeeper-3.3.1/src/c$ make run-check
make zktest-st zktest-mt
make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
make[1]: `zktest-st' is up to date.
make[1]: `zktest-mt' is up to date.
make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
./zktest-st
./tests/zkServer.sh: line 52: kill: (9432) - No such process
ZooKeeper server startedRunning
Zookeeper_operations::testPing : elapsed 1 : OK
Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 1 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 : OK
Zookeeper_operations::testConcurrentOperations1 : elapsed 203 : OK
Zookeeper_init::testBasic : elapsed 0 : OK
Zookeeper_init::testAddressResolution : elapsed 0 : OK
Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK
Zookeeper_init::testNullAddressString : elapsed 0 : OK
Zookeeper_init::testEmptyAddressString : elapsed 0 : OK
Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK
Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString2 : elapsed 1 : OK
Zookeeper_init::testNonexistentHost : elapsed 1 : OK
Zookeeper_init::testOutOfMemory_init : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs1 : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs2 : elapsed 0 : OK
Zookeeper_init::testPermuteAddrsList : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected1 : elapsed 0 : OK
Zookeeper_close::testCloseConnected1 : elapsed 0 : OK
Zookeeper_close::testCloseFromWatcher1 : elapsed 0 : OK
Zookeeper_simpleSystem::testAsyncWatcherAutoResetterminate called after throwing an instance of 'CppUnit::Exception'
what(): equality assertion failed
- Expected: -101
- Actual : -4
make: *** [run-check] Aborted

I have checked for a stale zk process: there are none.

Any ideas please ?

regards,
Martin

On 14 July 2010 18:37, Mahadev Konar maha...@yahoo-inc.com wrote:

HI Martin,
Can you check if you have a stale java process (ZooKeeperServer) running on your machine? That might cause some issues with the tests.
Thanks
mahadev

On 7/14/10 8:03 AM, Martin Waite waite@gmail.com wrote:

Hi,

I am attempting to build the C client on debian lenny. autoconf, configure, make and make install all appear to work cleanly. I ran:

autoreconf -if
./configure
make
make install
make run-check

However, the unit tests fail:

$ make run-check
make zktest-st zktest-mt
make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
make[1]: `zktest-st' is up to date.
make[1]: `zktest-mt' is up to date.
make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
./zktest-st
./tests/zkServer.sh: line 52: kill: (17711) - No such process
ZooKeeper server startedRunning
Zookeeper_operations::testPing : elapsed 1 : OK
Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 2 : OK
Zookeeper_operations
Re: unit test failure
Yes. It seems you didn't build the server actually. In the main directory (from where you ran ant compile_jute) run ant. This will invoke ant's default target which will build the server code. After that run either make run-check from src/c, or ant test-core-cppunit from the main dir.

On Tue, Aug 3, 2010 at 7:51 PM, Martin Waite waite@gmail.com wrote:

Hi Sergey,

Thanks for the hints.

1) I cannot find any ZK server logs. I have no tmp directory inside build directory. This sounds bad.
2) I have run ps -ef | grep zoo. There are no zk processes running.

I wonder if I have missed some steps. So far I have:

1. built a new debian lenny host
2. sudo apt-get install autoconf libtool libcppunit-dev sun-java6-jdk ant
3. tar zxf zookeeper-3.3.1.tar.gz
4. cd zookeeper-3.3.1
5. ant compile_jute
6. cd src/c
7. autoreconf -if
8. ./configure
9. make run-check

Do I have to do something before building the client code to install the ZK server code, or will it use the one in the unpacked tarball ?

regards,
Martin

On 3 August 2010 17:14, Sergey Doroshenko dors...@gmail.com wrote:

I don't really understand how the unit test system is meant to work.

Yes, C tests start ZK server and are tested against it. There are a few that use mocked server, but that's a legacy code I think.

Re the problem itself:

1) check ZK server logs at build/tmp/ , maybe log contains some info about why the server fails to start
2) are you absolutely sure you have no zk processes running? I usually do ps -ef | grep zoo and it always helps

On Tue, Aug 3, 2010 at 6:45 PM, Martin Waite waite@gmail.com wrote:

Hi,

A little more information.

The file TEST-Zookeeper_simpleSystem-st.txt contains some log data:

2010-08-03 13:00:13,550:11391:zoo_i...@zookeeper_init@727: Initiating client connection, host=127.0.0.1:22181 sessionTimeout=1 watcher=0x8078ed8 sessionId=0 sessionPasswd=null context=0xbfffd73c flags=0
2010-08-03 13:00:14,553:11391:zoo_er...@handle_socket_error_msg@1579: Socket [127.0.0.1:22181] zk retcode=-4, errno=111(Connection refused): server refused to accept the client

I don't really understand how the unit test system is meant to work. Does it start a zk server process and then run tests against that ? Or are the underlying libraries being tested without using a zk server ?

regards,
Martin

On 3 August 2010 13:02, Martin Waite waite@gmail.com wrote:

Hi Mahadev,

Sorry for the delay in replying: I have been away.

I have rebuilt my debian lenny machine, and started again. Again, I have the same problem:

~/zookeeper-3.3.1/src/c$ make run-check
make zktest-st zktest-mt
make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
make[1]: `zktest-st' is up to date.
make[1]: `zktest-mt' is up to date.
make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
./zktest-st
./tests/zkServer.sh: line 52: kill: (9432) - No such process
ZooKeeper server startedRunning
Zookeeper_operations::testPing : elapsed 1 : OK
Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 1 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 : OK
Zookeeper_operations::testConcurrentOperations1 : elapsed 203 : OK
Zookeeper_init::testBasic : elapsed 0 : OK
Zookeeper_init::testAddressResolution : elapsed 0 : OK
Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK
Zookeeper_init::testNullAddressString : elapsed 0 : OK
Zookeeper_init::testEmptyAddressString : elapsed 0 : OK
Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK
Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString2 : elapsed 1 : OK
Zookeeper_init::testNonexistentHost : elapsed 1 : OK
Zookeeper_init::testOutOfMemory_init : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs1 : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs2 : elapsed 0 : OK
Zookeeper_init::testPermuteAddrsList : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected1 : elapsed 0 : OK
Zookeeper_close::testCloseConnected1 : elapsed 0 : OK
Zookeeper_close::testCloseFromWatcher1 : elapsed 0 : OK
Zookeeper_simpleSystem::testAsyncWatcherAutoResetterminate called after throwing an instance of 'CppUnit::Exception'
what(): equality assertion failed
- Expected: -101
- Actual : -4
make: *** [run-check] Aborted

I have checked for a stale zk process: there are none
unit test failure
Hi,

I am attempting to build the C client on debian lenny. autoconf, configure, make and make install all appear to work cleanly. I ran:

autoreconf -if
./configure
make
make install
make run-check

However, the unit tests fail:

$ make run-check
make zktest-st zktest-mt
make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
make[1]: `zktest-st' is up to date.
make[1]: `zktest-mt' is up to date.
make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
./zktest-st
./tests/zkServer.sh: line 52: kill: (17711) - No such process
ZooKeeper server startedRunning
Zookeeper_operations::testPing : elapsed 1 : OK
Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 2 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 : OK
Zookeeper_operations::testConcurrentOperations1 : elapsed 206 : OK
Zookeeper_init::testBasic : elapsed 0 : OK
Zookeeper_init::testAddressResolution : elapsed 0 : OK
Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK
Zookeeper_init::testNullAddressString : elapsed 0 : OK
Zookeeper_init::testEmptyAddressString : elapsed 0 : OK
Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK
Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString2 : elapsed 2 : OK
Zookeeper_init::testNonexistentHost : elapsed 108 : OK
Zookeeper_init::testOutOfMemory_init : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs1 : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs2 : elapsed 0 : OK
Zookeeper_init::testPermuteAddrsList : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected1 : elapsed 0 : OK
Zookeeper_close::testCloseConnected1 : elapsed 0 : OK
Zookeeper_close::testCloseFromWatcher1 : elapsed 0 : OK
Zookeeper_simpleSystem::testAsyncWatcherAutoResetterminate called after throwing an instance of 'CppUnit::Exception'
what(): equality assertion failed
- Expected: -101
- Actual : -4
make: *** [run-check] Aborted

This appears to come from tests/TestClient.cc - but beyond that, it is hard to identify which equality assertion failed.

Help !

regards,
Martin
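
The two numbers in the failed assertion look like ZooKeeper C client return codes. A minimal lookup sketch follows. The values are quoted from memory of the ZOO_ERRORS enum in zookeeper.h and should be checked against the header under src/c/include; the class name and the selection of codes are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup table; verify the values against zookeeper.h before relying on them.
class ZkErrorCodes {

    private static final Map<Integer, String> NAMES = new HashMap<>();

    static {
        NAMES.put(0, "ZOK");
        NAMES.put(-4, "ZCONNECTIONLOSS");
        NAMES.put(-7, "ZOPERATIONTIMEOUT");
        NAMES.put(-101, "ZNONODE");
        NAMES.put(-110, "ZNODEEXISTS");
    }

    /** Returns the symbolic name for a return code, or a placeholder if it is not in the table. */
    static String describe(int rc) {
        return NAMES.getOrDefault(rc, "unknown (" + rc + ")");
    }

    public static void main(String[] args) {
        System.out.println("Expected: " + describe(-101));
        System.out.println("Actual  : " + describe(-4));
    }
}
```

Read this way, the test expected a "no node" result but got a connection loss, which suggests the client was not connected to a server when the call was made.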
Re: unit test failure
Hi Martin,

Can you check if you have a stale java process (ZooKeeperServer) running on your machine? That might cause some issues with the tests.

Thanks
mahadev

On 7/14/10 8:03 AM, Martin Waite waite@gmail.com wrote:

Hi,

I am attempting to build the C client on debian lenny. autoconf, configure, make and make install all appear to work cleanly. I ran:

autoreconf -if
./configure
make
make install
make run-check

However, the unit tests fail:

$ make run-check
make zktest-st zktest-mt
make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
make[1]: `zktest-st' is up to date.
make[1]: `zktest-mt' is up to date.
make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
./zktest-st
./tests/zkServer.sh: line 52: kill: (17711) - No such process
ZooKeeper server startedRunning
Zookeeper_operations::testPing : elapsed 1 : OK
Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 2 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 : OK
Zookeeper_operations::testConcurrentOperations1 : elapsed 206 : OK
Zookeeper_init::testBasic : elapsed 0 : OK
Zookeeper_init::testAddressResolution : elapsed 0 : OK
Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK
Zookeeper_init::testNullAddressString : elapsed 0 : OK
Zookeeper_init::testEmptyAddressString : elapsed 0 : OK
Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK
Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString2 : elapsed 2 : OK
Zookeeper_init::testNonexistentHost : elapsed 108 : OK
Zookeeper_init::testOutOfMemory_init : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs1 : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs2 : elapsed 0 : OK
Zookeeper_init::testPermuteAddrsList : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected1 : elapsed 0 : OK
Zookeeper_close::testCloseConnected1 : elapsed 0 : OK
Zookeeper_close::testCloseFromWatcher1 : elapsed 0 : OK
Zookeeper_simpleSystem::testAsyncWatcherAutoResetterminate called after throwing an instance of 'CppUnit::Exception'
what(): equality assertion failed
- Expected: -101
- Actual : -4
make: *** [run-check] Aborted

This appears to come from tests/TestClient.cc - but beyond that, it is hard to identify which equality assertion failed.

Help !

regards,
Martin
Re: test failures in branch-3.2
Hi Todd,

Sorry for the clutter/confusion. Usually things aren't this cumbersome ;-) In particular:

1 committer is on vacation
Mahadev's been out sick for multiple days
I'm sick but trying to hang in there, but def not 100%
Hudson (CI) has been offline for effectively the past 3 weeks (that gates all our commits) and is just now back but flaky.

3.2 had some bugs that we are trying to address, but the aforementioned issues are slowing us down. Otherwise we'd have all this straightened out by now.

At this point you should move this discussion to the dev list - Apache doesn't really like us to discuss code changes/futures here (user list). On that list you'll also see the plan for upcoming releases - I mention all this because we are actively working toward 3.2.1 which will include the JIRAs slated for that release (I'm sure you've seen). If you can wait a bit you might be able to avoid some pain by using the upcoming 3.2.1 release. Once the patches land into that branch your issues will be resolved w/o you needing to manually apply patches, etc...

I did look at the files you attached - they look fine, so I'm not sure what the issue is. The form of this test makes it harder - we are verifying that the log contains sufficient information when a particular error occurs. We fiddle with log4j in order to do this, which means that the log you are including doesn't specify the problem. Try instrumenting this test with a try/catch around the content of the test method (all the code in the failing method inside a big try/catch is what I mean). Then print the error to std out as part of the catch. That should shed some light. If you could debug it a bit that would help - because we aren't seeing this in our environment.

Again, sort of a moot point if you can wait a week or so...

Regards,
Patrick

Todd Greenwood wrote:

Inline.

-----Original Message-----
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, July 30, 2009 10:57 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:

Starting w/ branch-3.2 (no changes) I applied patches in this order:

1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails.

2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file - PortAssignment.java. PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch, which is a pretty hefty patch ( 2k lines) and touches a large number of files.

Hrm, those patches were probably created against the trunk. We'll have to have separate patches for trunk and 3.2 branch on 481. If you could update the jira with this detail (481 needs two patches, one for each branch) that would be great!

Done.

3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm crashes).

473 is special (unique) in the sense that it changes log4j while the VM is running. In general though it's a pretty boring test and shouldn't be failing. Are you sure you have the right patch file? There are 2 patch files on the JIRA for 473, make sure that you have the one from 7/16, NOT the one from 7/15. Check the patch file: the correct one should NOT contain changes to build.xml or conf/log4j* files. If this still happens send me your build.xml, conf/log4j* and QuorumPeerMainTest.java files in email for review. I'll take a look.

I've annotated the files w/ their date while downloading:

112700 2009-07-31 11:02 ZOOKEEPER-473-7-15.patch
110607 2009-07-31 11:01 ZOOKEEPER-473-7-16.patch

It appears I applied the 7-16 patch, as that is the matching file size of the patch file I applied. If there are to be multiple patch files for multiple branches (3.2, trunk, etc.) would it make sense to label the patch files accordingly?

Requested files in attached tar.

-Todd

Patrick

[junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
[junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)

Test Log

Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest
Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
Testcase: testBadPeerAddressInQuorum took 0.004 sec
Caused an ERROR
Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.

-Todd

-----Original Message-----
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, July 30, 2009 10:13 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:

[Todd] Yes, I believe address in use was the problem w/ FLETest. I assumed it was a timing issue w/ respect to test A not fully
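
Patrick's suggestion of wrapping the whole test method in a try/catch and printing the error to standard output can be sketched as follows. This is a generic illustration, not code from QuorumPeerMainTest: the helper class, the TestBody interface and the demo test name are all hypothetical, and a catch block of this kind only helps when the failure surfaces as a Java exception or error rather than as an abrupt exit of the forked VM.

```java
// Hypothetical helper; in a real JUnit test the body of the failing
// test method would be moved inside the call to runAndReport.
class TestInstrumentation {

    /** A test body that is allowed to throw checked exceptions. */
    interface TestBody {
        void run() throws Exception;
    }

    /** Runs the body, prints any failure to standard output, then rethrows it. */
    static void runAndReport(String testName, TestBody body) throws Exception {
        try {
            body.run();
        } catch (Throwable t) {
            // Printing to System.out bypasses whatever the test has done to log4j.
            System.out.println("Test " + testName + " failed with: " + t);
            t.printStackTrace(System.out);
            throw t; // rethrow so the test is still reported as failed
        }
    }

    public static void main(String[] args) {
        try {
            runAndReport("demoTest", () -> {
                throw new IllegalStateException("simulated failure");
            });
        } catch (Exception e) {
            System.out.println("Rethrown to caller: " + e.getMessage());
        }
    }
}
```

Rethrowing after printing keeps the test result unchanged while making the underlying error visible in the console output.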
RE: test failures in branch-3.2
Patrick,

Thank you for the background (and I hope you and Mahadev recover quickly).

On a plus note, I'm finding that this morning, @work rather than @home, the tests continue to completion. However, there are other issues that I'll bring up on the dev list, such as a requirement to have autoconf installed, and problems in the create-cppunit-configure task that can't exec libtoolize, fun stuff like that.

I need to proceed with the manual patches to branch-3.2, as I am under some time constraints to get our infrastructure deployed such that QA can start playing with it. However, I'll switch to 3.2.1 as soon as I can.

-Todd

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Friday, July 31, 2009 11:38 AM
To: zookeeper-user@hadoop.apache.org; Todd Greenwood
Subject: Re: test failures in branch-3.2

Hi Todd,

Sorry for the clutter/confusion. Usually things aren't this cumbersome ;-) In particular:

  - 1 committer is on vacation
  - Mahadev's been out sick for multiple days
  - I'm sick but trying to hang in there, but def not 100%
  - Hudson (CI) has been offline for effectively the past 3 weeks (that gates all our commits) and is just now back but flaky.

3.2 had some bugs that we are trying to address, but the aforementioned issues are slowing us down. Otherwise we'd have all this straightened out by now.

At this point you should move this discussion to the dev list - Apache doesn't really like us to discuss code changes/futures here (user list). On that list you'll also see the plan for upcoming releases - I mention all this because we are actively working toward 3.2.1, which will include the JIRAs slated for that release (I'm sure you've seen). If you can wait a bit you might be able to avoid some pain by using the upcoming 3.2.1 release. Once the patches land into that branch your issues will be resolved w/o you needing to manually apply patches, etc...

I did look at the files you attached - they look fine, so I'm not sure what the issue is.
The form of this test makes it harder - we are verifying that the log contains sufficient information when a particular error occurs. We fiddle with log4j in order to do this, which means that the log you are including doesn't specify the problem. Try instrumenting this test with a try/catch around the content of the test method (all the code in the failing method inside a big try/catch is what I mean). Then print the error to std out as part of the catch. That should shed some light. If you could debug it a bit that would help - because we aren't seeing this in our environment. Again, sort of a moot point if you can wait a week or so... Regards, Patrick Todd Greenwood wrote: Inline. -Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Thursday, July 30, 2009 10:57 PM To: zookeeper-user@hadoop.apache.org Subject: Re: test failures in branch-3.2 Todd Greenwood wrote: Starting w/ branch-3.2 (no changes) I applied patches in this order: 1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails. 2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file - PortAssignment.java. PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch, which is a pretty hefty patch ( 2k lines) and touches a large number of files. Hrm, those patches were probably created against the trunk. We'll have to have separate patches for trunk and 3.2 branch on 481. If you could update the jira with this detail (481 needs two patches, one for each branch) that would be great! Done. 3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm crashes). 473 is special (unique) in the sense that it changes log4j while the the vm is running. In general though it's a pretty boring test and shouldn't be failing. Are you sure you have the right patch file? there are 2 patch files on the JIRA for 473, make sure that you have the one from 7/16, NOT the one from 7/15. 
Check that the patch file, the correct one should NOT contain changes to build.xml or conf/log4j* files. If this still happens send me your build.xml, conf/log4j* and QuroumPeerMainTest.java files in email for review. I'll take a look. I've annotated the files w/ their date while downloading: 112700 2009-07-31 11:02 ZOOKEEPER-473-7-15.patch 110607 2009-07-31 11:01 ZOOKEEPER-473-7-16.patch It appears I applied the 7-16 patch, as that is the matching file size of the patch file I applied. If there are to be multiple patch files for multiple branches (3.2, trunk, etc.) would it make sense to lable the patch files accordingly? Requested files in attached tar. -Todd Patrick [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest [junit] Tests run: 1, Failures: 0, Errors: 1, Time
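A note on the try/catch suggestion quoted above. The actual body of testBadPeerAddressInQuorum is not shown anywhere in this thread, so the following is only a minimal sketch of the idea: wrap whatever the test method does, print the failure to stdout, and rethrow so the test still fails. The class name ReportingTestWrapper, the TestBody interface and the simulated failure are all hypothetical and not part of ZooKeeper or JUnit.

```java
import java.io.PrintStream;

/** Hypothetical helper, not part of ZooKeeper or JUnit. */
class ReportingTestWrapper {

    /** Stands in for the original contents of the failing test method. */
    interface TestBody {
        void run() throws Exception;
    }

    /** Prints the failure and its stack trace to the given stream. */
    private static void report(String testName, Throwable t, PrintStream out) {
        out.println("[" + testName + "] failed with: " + t);
        t.printStackTrace(out);
        out.flush();
    }

    /**
     * Runs the body. If it throws, the failure is printed (normally to
     * System.out, which does not depend on the log4j setup) and then
     * rethrown so the test result is unchanged.
     */
    static void runAndReport(String testName, TestBody body, PrintStream out) throws Exception {
        try {
            body.run();
        } catch (Exception e) {
            report(testName, e, out);
            throw e;
        } catch (Error e) {
            // Assertion failures are Errors, so they need their own catch.
            report(testName, e, out);
            throw e;
        }
    }

    public static void main(String[] args) {
        try {
            runAndReport("testBadPeerAddressInQuorum", () -> {
                // Simulated failure; the real test body would go here.
                throw new IllegalStateException("simulated failure");
            }, System.out);
        } catch (Exception expected) {
            System.out.println("rethrown as expected: " + expected.getMessage());
        }
    }
}
```

One caveat: a catch block only sees failures raised inside the running VM. If the forked VM exits outright, nothing is thrown and nothing will be printed, which would itself be a useful data point.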
Re: test failures in branch-3.2
Todd Greenwood wrote:

On a plus note, I'm finding that this morning, @work rather than @home, the tests continue to completion. However, there are other issues that I'll bring up on the dev list, such as a requirement to have autoconf installed, and problems in the create-cppunit-configure task that can't exec libtoolize, fun stuff like that.

Great, good to hear. At some point figuring out what's up with your @home would be interesting to us. :-) Yes, there are some basic requirements such as autotool, cppunit, etc... but please do raise all this on the dev list.

I need to proceed with the manual patches to branch-3.2, as I am under some time constraints to get our infrastructure deployed such that QA can start playing with it. However, I'll switch to 3.2.1 as soon as I can.

Understood.

Patrick

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Friday, July 31, 2009 11:38 AM
To: zookeeper-user@hadoop.apache.org; Todd Greenwood
Subject: Re: test failures in branch-3.2

Hi Todd,

Sorry for the clutter/confusion. Usually things aren't this cumbersome ;-) In particular:

  - 1 committer is on vacation
  - Mahadev's been out sick for multiple days
  - I'm sick but trying to hang in there, but def not 100%
  - Hudson (CI) has been offline for effectively the past 3 weeks (that gates all our commits) and is just now back but flaky.

3.2 had some bugs that we are trying to address, but the aforementioned issues are slowing us down. Otherwise we'd have all this straightened out by now.

At this point you should move this discussion to the dev list - Apache doesn't really like us to discuss code changes/futures here (user list). On that list you'll also see the plan for upcoming releases - I mention all this because we are actively working toward 3.2.1, which will include the JIRAs slated for that release (I'm sure you've seen). If you can wait a bit you might be able to avoid some pain by using the upcoming 3.2.1 release.
Once the patches land into that branch your issues will be resolved w/o you needing to manually apply patches, etc... I did look at the files you attached - it looks fine so I'm not sure the issue. The form of this test makes it harder - we are verifying that the log contains sufficient information when a particular error occurs. We fiddle with log4j in order to do this, which means that the log you are including doesn't specify the problem. Try instrumenting this test with a try/catch around the content of the test method (all the code in the failing method inside a big try/catch is what I mean). Then print the error to std out as part of the catch. That should shed some light. If you could debug it a bit that would help - because we aren't seeing this in our environment. Again, sort of a moot point if you can wait a week or so... Regards, Patrick Todd Greenwood wrote: Inline. -Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Thursday, July 30, 2009 10:57 PM To: zookeeper-user@hadoop.apache.org Subject: Re: test failures in branch-3.2 Todd Greenwood wrote: Starting w/ branch-3.2 (no changes) I applied patches in this order: 1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails. 2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file - PortAssignment.java. PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch, which is a pretty hefty patch ( 2k lines) and touches a large number of files. Hrm, those patches were probably created against the trunk. We'll have to have separate patches for trunk and 3.2 branch on 481. If you could update the jira with this detail (481 needs two patches, one for each branch) that would be great! Done. 3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm crashes). 473 is special (unique) in the sense that it changes log4j while the the vm is running. In general though it's a pretty boring test and shouldn't be failing. 
Are you sure you have the right patch file? there are 2 patch files on the JIRA for 473, make sure that you have the one from 7/16, NOT the one from 7/15. Check that the patch file, the correct one should NOT contain changes to build.xml or conf/log4j* files. If this still happens send me your build.xml, conf/log4j* and QuroumPeerMainTest.java files in email for review. I'll take a look. I've annotated the files w/ their date while downloading: 112700 2009-07-31 11:02 ZOOKEEPER-473-7-15.patch 110607 2009-07-31 11:01 ZOOKEEPER-473-7-16.patch It appears I applied the 7-16 patch, as that is the matching file size of the patch file I applied. If there are to be multiple patch files for multiple branches (3.2, trunk, etc.) would it make sense to lable the patch files accordingly? Requested files in attached tar. -Todd Patrick [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest [junit] Running
Re: bad svn url : test-patch
Hi Todd,

Yes, this happens with branch 3.2. The test-patch link is broken because of the hadoop split. This file is used for the hudson test environment. It isn't used anywhere else, so the svn co should otherwise be fine. We should fix it anyway.

Thanks
mahadev

On 7/30/09 2:57 PM, Todd Greenwood to...@audiencescience.com wrote:

FYI - looks like there is a bad url in svn...

    $ svn co http://svn.apache.org/repos/asf/hadoop/zookeeper/branches/branch-3.2 branch-3.2
    ...
    A    branch-3.2/build.xml
    Fetching external item into 'branch-3.2/src/java/test/bin'
    svn: URL 'http://svn.apache.org/repos/asf/hadoop/common/nightly/test-patch' doesn't exist

This does not repro w/ 3.1:

    $ svn co http://svn.apache.org/repos/asf/hadoop/zookeeper/branches/branch-3.1 branch-3.1

-Todd
RE: bad svn url : test-patch
Thanks Mahadev.

-Original Message-
From: Mahadev Konar [mailto:maha...@yahoo-inc.com]
Sent: Thursday, July 30, 2009 3:00 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: bad svn url : test-patch

Hi Todd,

Yes, this happens with branch 3.2. The test-patch link is broken because of the hadoop split. This file is used for the hudson test environment. It isn't used anywhere else, so the svn co should otherwise be fine. We should fix it anyway.

Thanks
mahadev

On 7/30/09 2:57 PM, Todd Greenwood to...@audiencescience.com wrote:

FYI - looks like there is a bad url in svn...

    $ svn co http://svn.apache.org/repos/asf/hadoop/zookeeper/branches/branch-3.2 branch-3.2
    ...
    A    branch-3.2/build.xml
    Fetching external item into 'branch-3.2/src/java/test/bin'
    svn: URL 'http://svn.apache.org/repos/asf/hadoop/common/nightly/test-patch' doesn't exist

This does not repro w/ 3.1:

    $ svn co http://svn.apache.org/repos/asf/hadoop/zookeeper/branches/branch-3.1 branch-3.1

-Todd
Re: test failures in branch-3.2
Todd,

On Jul 30, 2009, at 5:08 PM, Todd Greenwood wrote:

The build succeeds, but not all of the tests. In previous test runs, I noticed an error in org.apache.zookeeper.test.FLETest. It was not able to bind to a port or something. Now, after a machine reboot, I'm getting different failures.

This issue might be fixed in trunk, but not in the 3.2 distribution.

    branch-3.2 $ ant test
    [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)
    [junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED

HierarchicalQuorumTest is supposed to fail until you apply the patches I mentioned. I don't know what could have caused the crash of the jvm in the other one.

-Flavio
Re: test failures in branch-3.2
btw QuorumPeerMainTest uses the CONSOLE appender, which is set up in conf/log4j.properties; now that I think of it perhaps not such a good idea :-) If you edited conf/log4j.properties it may be causing the test to fail, did you do this? (If you run the test by itself using -Dtestcase, does it always fail?)

I've entered a jira to address this: https://issues.apache.org/jira/browse/ZOOKEEPER-492

Patrick

Patrick Hunt wrote:

Todd Greenwood wrote:

The build succeeds, but not all of the tests. In previous test runs, I noticed an error in org.apache.zookeeper.test.FLETest. It was not able to bind to a port or something. Now, after a machine reboot, I'm getting different failures.

address in use? That's a problem in the test framework pre-3.3. In 3.3 (current svn trunk) I fixed it but it's not in 3.2.x. This is a problem with the test framework though and not a real problem; it shows up occasionally (depends on timing).

    branch-3.2 $ ant test
    [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)
    [junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED

Test logs for these two tests attached.

This is unusual though - looking at the log it seems that the JVM itself crashed for the QPMainTest! For HQT we are seeing:

    junit.framework.AssertionFailedError: Threads didn't join

which Flavio mentioned to me once is possible to happen but not a real problem (he can elaborate).

What version of java are you using? OS, other environment that might be interesting? (vm? etc...) You might try looking at the jvm crash dump file (I think it's in /tmp).

If you run each of these two tests individually do they run? example:

    ant -Dtestcase=FLENewEpochTest test-core-java

My goal here is to get to a known state (all tests succeeding or have workarounds for the failures). Following that, I plan to apply the patches Flavio recommended for a WAN deploy (479 and 481). After I verify that the tests continue to run, I'll package this up and deploy it to our WAN for testing.

Sounds like a good plan.

So, are these known issues? Do the tests normally run en masse, or do some of the tests hold on to resources and prevent other tests from passing?

Typically they do run to completion, but occasionally on my machine (java 1.6, linux32bit, 1.6g single core cpu, 1gigmem) I'll get some random failure due to address in use, or the same didn't join that you saw. Usually I see this if I'm multitasking (vs just letting the tests run w/o using the box). As I said this is addressed in 3.3 (address reuse at the very least, and I haven't seen the other issues).

Patrick
RE: test failures in branch-3.2
No edits to conf/log4j.properties. -Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Thursday, July 30, 2009 9:25 PM To: Patrick Hunt Cc: zookeeper-user@hadoop.apache.org Subject: Re: test failures in branch-3.2 btw QuorumPeerMainTest uses the CONSOLE appender which is setup in conf/log4j.properties, now that I think of it perhaps not such a good idea :-) If you edited cong/log4j.properties it may be causing the test to fail, did you do this? (if you run the test by itself using -Dtestcase does it always fail?) I've entered a jira to address this: https://issues.apache.org/jira/browse/ZOOKEEPER-492 Patrick Patrick Hunt wrote: Todd Greenwood wrote: The build succeeds, but not the all of the tests. In previous test runs, I noticed an error in org.apache.zookeeper.test.FLETest. It was not able to bind to a port or something. Now, after a machine reboot, I'm getting different failures. address in use? That's a problem in the test framework pre-3.3. In 3.3 (current svn trunk) I fixed it but it's not in 3.2.x. This is a problem with the test framework though and not a real problem, it shows up occasionally (depends on timing). branch-3.2 $ ant test [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed) [junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED Test logs for these two tests attached. This is unusual though - looking at the log it seems that the JVM itself crashed for the QPMainTest! for HQT we are seeing: junit.framework.AssertionFailedError: Threads didn't join which Flavio mentioned to me once is possible to happen but not a real problem (he can elaborate). What version of java are you using? OS, other environment that might be interesting? (vm? etc...) You might try looking at the jvm crash dump file (I think it's in /tmp) If you run each of these two tests individually do they run? 
example: ant -Dtestcase=FLENewEpochTest test-core-java My goal here is to get to a known state (all tests succeeding or have workarounds for the failures). Following that, I plan to apply the patches Flavio recommended for a WAN deploy (479 and 481). After I verify that the tests continue to run, I'll package this up and deploy it to our WAN for testing. Sounds like a good plan. So, are these known issues? Do the tests normally run en masse, or do some of the tests hold on to resources and prevent other tests from passing? Typically they do run to completion, but occasionally on my machine (java 1.6, linux32bit, 1.6g single core cpu, 1gigmem) I'll get some random failure due to address in use, or the same didn't join that you saw. Usually I see this if I'm multitasking (vs just letting the tests run w/o using the box). As I said this is addressed in 3.3 (address reuse at the very least, and I haven't see the other issues). Patrick
Re: test failures in branch-3.2
well try running these two tests individually and see if they always fail or just occassionally. that will be a good start (and the env detail). Patrick Todd Greenwood wrote: No edits to conf/log4j.properties. -Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Thursday, July 30, 2009 9:25 PM To: Patrick Hunt Cc: zookeeper-user@hadoop.apache.org Subject: Re: test failures in branch-3.2 btw QuorumPeerMainTest uses the CONSOLE appender which is setup in conf/log4j.properties, now that I think of it perhaps not such a good idea :-) If you edited cong/log4j.properties it may be causing the test to fail, did you do this? (if you run the test by itself using -Dtestcase does it always fail?) I've entered a jira to address this: https://issues.apache.org/jira/browse/ZOOKEEPER-492 Patrick Patrick Hunt wrote: Todd Greenwood wrote: The build succeeds, but not the all of the tests. In previous test runs, I noticed an error in org.apache.zookeeper.test.FLETest. It was not able to bind to a port or something. Now, after a machine reboot, I'm getting different failures. address in use? That's a problem in the test framework pre-3.3. In 3.3 (current svn trunk) I fixed it but it's not in 3.2.x. This is a problem with the test framework though and not a real problem, it shows up occasionally (depends on timing). branch-3.2 $ ant test [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed) [junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED Test logs for these two tests attached. This is unusual though - looking at the log it seems that the JVM itself crashed for the QPMainTest! for HQT we are seeing: junit.framework.AssertionFailedError: Threads didn't join which Flavio mentioned to me once is possible to happen but not a real problem (he can elaborate). What version of java are you using? OS, other environment that might be interesting? (vm? etc...) 
You might try looking at the jvm crash dump file (I think it's in /tmp) If you run each of these two tests individually do they run? example: ant -Dtestcase=FLENewEpochTest test-core-java My goal here is to get to a known state (all tests succeeding or have workarounds for the failures). Following that, I plan to apply the patches Flavio recommended for a WAN deploy (479 and 481). After I verify that the tests continue to run, I'll package this up and deploy it to our WAN for testing. Sounds like a good plan. So, are these known issues? Do the tests normally run en masse, or do some of the tests hold on to resources and prevent other tests from passing? Typically they do run to completion, but occasionally on my machine (java 1.6, linux32bit, 1.6g single core cpu, 1gigmem) I'll get some random failure due to address in use, or the same didn't join that you saw. Usually I see this if I'm multitasking (vs just letting the tests run w/o using the box). As I said this is addressed in 3.3 (address reuse at the very least, and I haven't see the other issues). Patrick
RE: test failures in branch-3.2
Patrick, inline.

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, July 30, 2009 9:13 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:

The build succeeds, but not all of the tests. In previous test runs, I noticed an error in org.apache.zookeeper.test.FLETest. It was not able to bind to a port or something. Now, after a machine reboot, I'm getting different failures.

address in use? That's a problem in the test framework pre-3.3. In 3.3 (current svn trunk) I fixed it but it's not in 3.2.x. This is a problem with the test framework though and not a real problem; it shows up occasionally (depends on timing).

[Todd] Yes, I believe address in use was the problem w/ FLETest. I assumed it was a timing issue w/ respect to test A not fully releasing resources before test B started.

    branch-3.2 $ ant test
    [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)
    [junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED

Test logs for these two tests attached.

This is unusual though - looking at the log it seems that the JVM itself crashed for the QPMainTest! For HQT we are seeing:

    junit.framework.AssertionFailedError: Threads didn't join

which Flavio mentioned to me once is possible to happen but not a real problem (he can elaborate).

What version of java are you using? OS, other environment that might be interesting? (vm? etc...) You might try looking at the jvm crash dump file (I think it's in /tmp).

[Todd] ---

    $ uname -a
    Linux TODDG01LT 2.6.28-14-generic #47-Ubuntu SMP Sat Jul 25 01:19:55 UTC 2009 x86_64 GNU/Linux
    $ which java
    /home/toddg/bin/x64/java/jdk1.6.0_13/bin/java
    $ java -version
    java version 1.6.0_13
    Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
    Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed mode)

    Memory = 4GB

[Todd] ---

If you run each of these two tests individually do they run? example:

    ant -Dtestcase=FLENewEpochTest test-core-java

[Todd] Will try this once my local build is working and report back. I'll open a separate mail thread on applying patches.

My goal here is to get to a known state (all tests succeeding or have workarounds for the failures). Following that, I plan to apply the patches Flavio recommended for a WAN deploy (479 and 481). After I verify that the tests continue to run, I'll package this up and deploy it to our WAN for testing.

Sounds like a good plan.

So, are these known issues? Do the tests normally run en masse, or do some of the tests hold on to resources and prevent other tests from passing?

Typically they do run to completion, but occasionally on my machine (java 1.6, linux32bit, 1.6g single core cpu, 1gigmem) I'll get some random failure due to address in use, or the same didn't join that you saw. Usually I see this if I'm multitasking (vs just letting the tests run w/o using the box). As I said this is addressed in 3.3 (address reuse at the very least, and I haven't seen the other issues).

Patrick
RE: test failures in branch-3.2
Patrick/Flavio -

Starting w/ branch-3.2 (no changes) I applied patches in this order:

1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails.
2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file - PortAssignment.java. PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch, which is a pretty hefty patch ( 2k lines) and touches a large number of files.
3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm crashes).

    [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
    [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
    [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
    [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)

Test Log

    Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest
    Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
    Testcase: testBadPeerAddressInQuorum took 0.004 sec
    Caused an ERROR
    Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
    junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.

-Todd

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, July 30, 2009 10:13 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:

[Todd] Yes, I believe address in use was the problem w/ FLETest. I assumed it was a timing issue w/ respect to test A not fully releasing resources before test B started.

Might be, but actually I think it's related to this: http://hea-www.harvard.edu/~fine/Tech/addrinuse.html

Patrick
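For anyone who wants to see the "address in use" symptom outside the test suite, here is a rough, standalone sketch. It assumes the failure is the usual java.net.BindException raised when a port is still held, and it shows where SO_REUSEADDR would be set. The class AddressInUseDemo and its canBind helper are made up for illustration; this is not how the ZooKeeper test framework handles ports, and behaviour can differ between operating systems.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

/** Hypothetical demo class, not taken from the ZooKeeper test framework. */
class AddressInUseDemo {

    /**
     * Tries to open a listening socket on the loopback interface.
     * Returns false if the bind is rejected with a BindException.
     */
    static boolean canBind(int port, boolean reuseAddress) throws IOException {
        try (ServerSocket socket = new ServerSocket()) {
            // SO_REUSEADDR only has an effect if it is set before bind().
            socket.setReuseAddress(reuseAddress);
            socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
            return true;
        } catch (BindException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // "Test A": grab an ephemeral port and keep listening on it.
        ServerSocket first = new ServerSocket();
        first.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        int port = first.getLocalPort();

        // "Test B" starts before A has released the port.
        System.out.println("while A holds the port: canBind = " + canBind(port, true));

        // Once A has closed its socket the same port can be bound again.
        first.close();
        System.out.println("after A closed it:      canBind = " + canBind(port, true));
    }
}
```

Note that this only reproduces the "port still held by another listener" case. A port whose earlier connections are still winding down after close is a separate situation, and that is the one SO_REUSEADDR is normally meant to help with.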
Re: test failures in branch-3.2
Todd Greenwood wrote:

Starting w/ branch-3.2 (no changes) I applied patches in this order:

1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails.
2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file - PortAssignment.java. PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch, which is a pretty hefty patch ( 2k lines) and touches a large number of files.

Hrm, those patches were probably created against the trunk. We'll have to have separate patches for trunk and 3.2 branch on 481. If you could update the jira with this detail (481 needs two patches, one for each branch) that would be great!

3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm crashes).

473 is special (unique) in the sense that it changes log4j while the vm is running. In general though it's a pretty boring test and shouldn't be failing. Are you sure you have the right patch file? There are 2 patch files on the JIRA for 473; make sure that you have the one from 7/16, NOT the one from 7/15. Check the patch file: the correct one should NOT contain changes to build.xml or conf/log4j* files.

If this still happens send me your build.xml, conf/log4j* and QuorumPeerMainTest.java files in email for review. I'll take a look.

Patrick

    [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
    [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
    [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
    [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)

Test Log

    Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest
    Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
    Testcase: testBadPeerAddressInQuorum took 0.004 sec
    Caused an ERROR
    Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
    junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.

-Todd

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, July 30, 2009 10:13 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:

[Todd] Yes, I believe address in use was the problem w/ FLETest. I assumed it was a timing issue w/ respect to test A not fully releasing resources before test B started.

Might be, but actually I think it's related to this: http://hea-www.harvard.edu/~fine/Tech/addrinuse.html

Patrick
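The "correct patch should NOT touch build.xml or conf/log4j*" check above can be done by eye, but a small script makes it repeatable. The sketch below assumes the patch is an ordinary unified diff whose file headers come as a "--- " line followed by a "+++ " line (the usual svn diff layout); the class PatchFileCheck and the sample lines are invented for illustration and say nothing about the actual contents of either 473 patch file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for inspecting a unified diff; not part of ZooKeeper. */
class PatchFileCheck {

    /** Collects the paths named on "+++ " header lines that directly follow a "--- " line. */
    static List<String> touchedFiles(List<String> patchLines) {
        List<String> files = new ArrayList<>();
        boolean previousWasOldHeader = false;
        for (String line : patchLines) {
            if (previousWasOldHeader && line.startsWith("+++ ")) {
                String path = line.substring(4);
                // Diff tools append a tab plus a revision or timestamp after the path.
                int tab = path.indexOf('\t');
                if (tab >= 0) {
                    path = path.substring(0, tab);
                }
                files.add(path.trim());
            }
            // Limitation: a removed line starting with "-- " would also match here.
            previousWasOldHeader = line.startsWith("--- ");
        }
        return files;
    }

    /** Returns the touched paths that are build.xml or live under conf/log4j*. */
    static List<String> unexpectedFiles(List<String> files) {
        List<String> hits = new ArrayList<>();
        for (String f : files) {
            boolean buildXml = f.equals("build.xml") || f.endsWith("/build.xml");
            boolean log4jConf = f.contains("conf/log4j");
            if (buildXml || log4jConf) {
                hits.add(f);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Pass the patch file as the first argument; otherwise use an invented sample.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList(
                        "--- build.xml\t(revision 1)",
                        "+++ build.xml\t(working copy)",
                        "@@ -1 +1 @@",
                        "-old",
                        "+new");
        List<String> files = touchedFiles(lines);
        System.out.println("files touched: " + files);
        System.out.println("unexpected:    " + unexpectedFiles(files));
    }
}
```

Run against the downloaded patch, an empty "unexpected" list would be consistent with having the file Patrick describes as the correct one; a non-empty list suggests the other file.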