Re: unit test failure

2010-08-04 Thread Martin Waite
Hi Sergey,

I had a feeling that I was missing something important.  Thanks.

On running Ant, I tripped over the firewall trying to download Ivy.  Ant
needed to know about our internet proxy:

export ANT_OPTS="-Dhttp.proxyHost=proxy -Dhttp.proxyPort=8080"
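
As a rough sketch, the proxy flags can also be assembled by a small helper. The function name proxy_opts is made up for illustration, and adding the https.* properties is an assumption (they are standard Java networking properties, but this build may not need them):

```shell
# Hypothetical helper: prints JVM proxy flags for the given host and port.
# http.proxyHost/http.proxyPort and the https.* variants are standard Java
# networking system properties; "proxy" and 8080 are the values used in
# this message.
proxy_opts() {
    host=$1
    port=$2
    printf '%s' "-Dhttp.proxyHost=$host -Dhttp.proxyPort=$port -Dhttps.proxyHost=$host -Dhttps.proxyPort=$port"
}

# The quotes keep all flags in one variable; unquoted, the shell would
# pass the later flags to export as separate arguments.
ANT_OPTS="$(proxy_opts proxy 8080)"
export ANT_OPTS
```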

The build reported a few errors, but seems to have completed:

vm-026-lenny-mw$ ant
Buildfile: build.xml

init:

ivy-download:
  [get] Getting:
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: /home/martin/zookeeper-3.3.1/src/java/lib/ivy-2.1.0.jar

ivy-taskdef:

ivy-init:

ivy-retrieve:
[ivy:retrieve] :: Ivy 2.1.0 - 20090925235825 :: http://ant.apache.org/ivy/::
[ivy:retrieve] :: loading settings :: file =
/home/martin/zookeeper-3.3.1/ivysettings.xml
[ivy:retrieve] :: resolving dependencies ::
org.apache.zookeeper#zookeeper;3.3.2-dev
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  found log4j#log4j;1.2.15 in maven2
[ivy:retrieve]  found jline#jline;0.9.94 in maven2
[ivy:retrieve] downloading
http://repo1.maven.org/maven2/log4j/log4j/1.2.15/log4j-1.2.15.jar ...
[ivy:retrieve] 
[ivy:retrieve] 
[ivy:retrieve] 
[ivy:retrieve] ..
[ivy:retrieve] ..
[ivy:retrieve] .
[ivy:retrieve] .
[ivy:retrieve] .
[ivy:retrieve] . (382kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] log4j#log4j;1.2.15!log4j.jar (15533ms)
[ivy:retrieve] downloading
http://repo1.maven.org/maven2/jline/jline/0.9.94/jline-0.9.94.jar ...
[ivy:retrieve] ...
[ivy:retrieve] .
[ivy:retrieve] . (85kB)
[ivy:retrieve] .. (0kB)
[ivy:retrieve]  [SUCCESSFUL ] jline#jline;0.9.94!jline.jar (8232ms)
[ivy:retrieve] :: resolution report :: resolve 11530ms :: artifacts dl
23769ms

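---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |   2   |   2   |   2   |   0   ||   2   |   2   |
---------------------------------------------------------------------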
[ivy:retrieve] :: retrieving :: org.apache.zookeeper#zookeeper
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  2 artifacts copied, 0 already retrieved (467kB/24ms)

clover.setup:

clover.info:

clover:

jute:

compile_jute_uptodate:

compile_jute:

ver-gen:
[javac] Compiling 1 source file to
/home/martin/zookeeper-3.3.1/build/classes

svn-revision:
[mkdir] Created dir: /home/martin/zookeeper-3.3.1/.revision
 [exec] /home/martin/zookeeper-3.3.1/src/lastRevision.sh: line 19: svn:
command not found

version-info:
 [java] All version-related parameters must be valid integers!
 [java] Exception in thread "main" java.lang.NumberFormatException: For
input string: "2-dev"
 [java] at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 [java] at java.lang.Integer.parseInt(Integer.java:458)
 [java] at java.lang.Integer.parseInt(Integer.java:499)
 [java] at
org.apache.zookeeper.version.util.VerGen.main(VerGen.java:131)
 [java] Java Result: 1

build-generated:
[javac] Compiling 46 source files to
/home/martin/zookeeper-3.3.1/build/classes

compile:
[javac] Compiling 113 source files to
/home/martin/zookeeper-3.3.1/build/classes

jar:
  [jar] Building jar:
/home/martin/zookeeper-3.3.1/build/zookeeper-3.3.2-dev.jar

BUILD SUCCESSFUL
Total time: 59 seconds

What I do not understand now is why the process has built 3.3.2-dev within
the 3.3.1 source bundle.  Did the build download a new version ?
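
The version-info error in the log looks consistent with the version string being split into parts that are each parsed as an integer: the stack trace shows Integer.parseInt rejecting 2-dev. A minimal sketch of that check, assuming (this is not confirmed by the log) that the split is on dots; the function name is hypothetical:

```shell
# Hypothetical check: print every dot-separated part of a version string
# that is not a plain non-negative integer.
non_numeric_parts() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the argument on dots
    IFS=$old_ifs
    for part in "$@"; do
        case $part in
            ''|*[!0-9]*) echo "$part" ;;
        esac
    done
}

non_numeric_parts 3.3.2-dev   # prints: 2-dev
non_numeric_parts 3.3.1       # prints nothing
```

If that assumption holds, a -dev suffix on the last component would be enough to trigger the NumberFormatException without stopping the rest of the build.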

regards,
Martin



On 3 August 2010 18:18, Sergey Doroshenko dors...@gmail.com wrote:

 Yes. It seems you didn't build the server actually.
 In the main directory (from where you ran ant compile_jute), run ant.
 This will invoke ant's default target which will build the server code.
 After that run either make run-check from src/c, or ant
 test-core-cppunit from the main dir.


Re: unit test failure

2010-08-03 Thread Martin Waite
Hi Mahadev,

Sorry for the delay in replying:  I have been away.

I have rebuilt my debian lenny machine, and started again.

Again, I have the same problem:

~/zookeeper-3.3.1/src/c$ make run-check
make  zktest-st zktest-mt
make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
make[1]: `zktest-st' is up to date.
make[1]: `zktest-mt' is up to date.
make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
./zktest-st
./tests/zkServer.sh: line 52: kill: (9432) - No such process
 ZooKeeper server startedRunning
Zookeeper_operations::testPing : elapsed 1 : OK
Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 1 :
OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 :
OK
Zookeeper_operations::testConcurrentOperations1 : elapsed 203 : OK
Zookeeper_init::testBasic : elapsed 0 : OK
Zookeeper_init::testAddressResolution : elapsed 0 : OK
Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK
Zookeeper_init::testNullAddressString : elapsed 0 : OK
Zookeeper_init::testEmptyAddressString : elapsed 0 : OK
Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK
Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString2 : elapsed 1 : OK
Zookeeper_init::testNonexistentHost : elapsed 1 : OK
Zookeeper_init::testOutOfMemory_init : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs1 : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs2 : elapsed 0 : OK
Zookeeper_init::testPermuteAddrsList : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected1 : elapsed 0 : OK
Zookeeper_close::testCloseConnected1 : elapsed 0 : OK
Zookeeper_close::testCloseFromWatcher1 : elapsed 0 : OK
Zookeeper_simpleSystem::testAsyncWatcherAutoResetterminate called after
throwing an instance of 'CppUnit::Exception'
  what():  equality assertion failed
- Expected: -101
- Actual  : -4

make: *** [run-check] Aborted

I have checked for a stale zk process:  there are none.

Any ideas please ?

regards,
Martin



On 14 July 2010 18:37, Mahadev Konar maha...@yahoo-inc.com wrote:

 Hi Martin,
  Can you check if you have a stale java process (ZooKeeperServer) running
 on your machine? That might cause some issues with the tests.


 Thanks
 mahadev



Re: unit test failure

2010-08-03 Thread Martin Waite
Hi,

A little more information.

The file TEST-Zookeeper_simpleSystem-st.txt contains some log data:

2010-08-03 13:00:13,550:11391:zoo_i...@zookeeper_init@727: Initiating client
connection, host=127.0.0.1:22181 sessionTimeout=1 watcher=0x8078ed8
sessionId=0 sessionPasswd=null context=0xbfffd73c flags=0
2010-08-03 13:00:14,553:11391:zoo_er...@handle_socket_error_msg@1579: Socket
[127.0.0.1:22181] zk retcode=-4, errno=111(Connection refused): server
refused to accept the client

I don't really understand how the unit test system is meant to work.  Does
it start a zk server process and then run tests against that ?  Or are the
underlying libraries being tested without using a zk server ?
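
To pull the refused endpoint out of a log like the one above, something along these lines might help. It is only a sketch: the function name is made up, and it assumes each log record sits on a single line in the file (the lines above are wrapped by the mail client):

```shell
# Hypothetical helper: read client log lines on stdin and print the
# host:port of every socket for which "Connection refused" was logged.
refused_endpoints() {
    sed -n 's/.*Socket \[\([^]]*\)].*Connection refused.*/\1/p'
}
```

It could be run as refused_endpoints < TEST-Zookeeper_simpleSystem-st.txt; an endpoint printed there means nothing was listening on that port when the client tried to connect.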

regards,
Martin

On 3 August 2010 13:02, Martin Waite waite@gmail.com wrote:

 Hi Mahadev,

 Sorry for the delay in replying:  I have been away.

 I have rebuilt my debian lenny machine, and started again.

 Again, I have the same problem:

 ~/zookeeper-3.3.1/src/c$ make run-check

 make  zktest-st zktest-mt
 make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
 make[1]: `zktest-st' is up to date.
 make[1]: `zktest-mt' is up to date.
 make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
 ./zktest-st
 ./tests/zkServer.sh: line 52: kill: (9432) - No such process

  ZooKeeper server startedRunning
 Zookeeper_operations::testPing : elapsed 1 : OK
 Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
 Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
 Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 1
 : OK

 Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0
 : OK
 Zookeeper_operations::testConcurrentOperations1 : elapsed 203 : OK

 Zookeeper_init::testBasic : elapsed 0 : OK
 Zookeeper_init::testAddressResolution : elapsed 0 : OK
 Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK
 Zookeeper_init::testNullAddressString : elapsed 0 : OK
 Zookeeper_init::testEmptyAddressString : elapsed 0 : OK
 Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK
 Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK
 Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK
 Zookeeper_init::testInvalidAddressString2 : elapsed 1 : OK
 Zookeeper_init::testNonexistentHost : elapsed 1 : OK

 Zookeeper_init::testOutOfMemory_init : elapsed 0 : OK
 Zookeeper_init::testOutOfMemory_getaddrs1 : elapsed 0 : OK
 Zookeeper_init::testOutOfMemory_getaddrs2 : elapsed 0 : OK
 Zookeeper_init::testPermuteAddrsList : elapsed 0 : OK
 Zookeeper_close::testCloseUnconnected : elapsed 0 : OK
 Zookeeper_close::testCloseUnconnected1 : elapsed 0 : OK
 Zookeeper_close::testCloseConnected1 : elapsed 0 : OK
 Zookeeper_close::testCloseFromWatcher1 : elapsed 0 : OK
 Zookeeper_simpleSystem::testAsyncWatcherAutoResetterminate called after
 throwing an instance of 'CppUnit::Exception'
   what():  equality assertion failed
 - Expected: -101
 - Actual  : -4

 make: *** [run-check] Aborted

 I have checked for a stale zk process:  there are none.

 Any ideas please ?

 regards,
 Martin





Re: unit test failure

2010-08-03 Thread Sergey Doroshenko
 I don't really understand how the unit test system is meant to work.

Yes, the C tests start a ZK server and run against it. A few use a mocked
server, but that's legacy code, I think.

Re the problem itself:
1) check the ZK server logs in build/tmp/; the log may say why the server
fails to start
2) are you absolutely sure you have no zk processes running? I usually run
ps -ef | grep zoo and it always helps
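
The process check in 2) can be wrapped so that the grep command itself does not show up as a false match. This is just a sketch; the function name is hypothetical:

```shell
# Hypothetical helper: filter `ps -ef` output (on stdin) down to lines that
# mention "zoo", dropping the grep command itself. Always returns success,
# so an empty result is not treated as an error.
zk_processes() {
    grep -i 'zoo' | grep -v 'grep' || true
}
```

Typical use would be ps -ef | zk_processes, followed by ls -l build/tmp to see whether the server wrote any logs at all.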

On Tue, Aug 3, 2010 at 6:45 PM, Martin Waite waite@gmail.com wrote:

 Hi,

 A little more information.

 The file TEST-Zookeeper_simpleSystem-st.txt contains some log data:

 2010-08-03 13:00:13,550:11391:zoo_i...@zookeeper_init@727: Initiating
 client
 connection, host=127.0.0.1:22181 sessionTimeout=1 watcher=0x8078ed8
 sessionId=0 sessionPasswd=null context=0xbfffd73c flags=0
 2010-08-03 13:00:14,553:11391:zoo_er...@handle_socket_error_msg@1579:
 Socket
 [127.0.0.1:22181] zk retcode=-4, errno=111(Connection refused): server
 refused to accept the client

 I don't really understand how the unit test system is meant to work.
  Does
 it start a zk server process and then run tests against that ?  Or are the
 underlying libraries being tested without using a zk server ?

 regards,
 Martin


Re: unit test failure

2010-08-03 Thread Sergey Doroshenko
Yes. It seems you didn't build the server actually.
In the main directory (from where you ran ant compile_jute), run ant.
This will invoke ant's default target which will build the server code.
After that run either make run-check from src/c, or ant
test-core-cppunit from the main dir.
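
Put together, the order would be roughly as follows: build the server with ant first, then build and run the C tests. This is a sketch only; ZK_HOME, DRY_RUN and the function names are made up for illustration, and the autoreconf and configure steps are taken from the steps listed in the quoted message:

```shell
# Sketch of the build-then-test order. Set DRY_RUN=1 to print the commands
# instead of running them. ZK_HOME is a hypothetical variable pointing at
# the unpacked source tree (defaults to the current directory).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Runs ant's default target in the top-level directory (which builds the
# server), then prepares and runs the C client tests from src/c.
build_and_check() {
    zk_home=${ZK_HOME:-.}
    run cd "$zk_home" &&
    run ant &&
    run cd src/c &&
    run autoreconf -if &&
    run ./configure &&
    run make run-check
}
```

Running DRY_RUN=1 build_and_check first shows the commands without executing anything.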

On Tue, Aug 3, 2010 at 7:51 PM, Martin Waite waite@gmail.com wrote:

 Hi Sergey,

 Thanks for the hints.

 1) I cannot find any ZK server logs.  I have no tmp directory inside build
 directory.  This sounds bad.

 2) I have run ps -ef | grep zoo.   There are no zk processes running.

 I wonder if I have missed some steps.

 So far I have:

   1. built a new debian lenny host
   2. sudo apt-get install autoconf libtool libcppunit-dev sun-java6-jdk ant
   3. tar zxf zookeeper-3.3.1.tar.gz
   4. cd zookeeper-3.3.1
   5. ant compile_jute
   6. cd src/c
   7. autoreconf -if
   8. ./configure
   9. make run-check

 Do I have to do something before building the client code to install the ZK
 server code, or will it use the one in the unpacked tarball ?

 regards,
 Martin


unit test failure

2010-07-14 Thread Martin Waite
Hi,

I am attempting to build the C client on debian lenny.

autoconf, configure, make and make install all appear to work cleanly.

I ran:

autoreconf -if
./configure
make
make install
make run-check

However, the unit tests fail:

$ make run-check
make  zktest-st zktest-mt
make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
make[1]: `zktest-st' is up to date.
make[1]: `zktest-mt' is up to date.
make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
./zktest-st
./tests/zkServer.sh: line 52: kill: (17711) - No such process
 ZooKeeper server startedRunning
Zookeeper_operations::testPing : elapsed 1 : OK
Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 2 :
OK
Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 :
OK
Zookeeper_operations::testConcurrentOperations1 : elapsed 206 : OK
Zookeeper_init::testBasic : elapsed 0 : OK
Zookeeper_init::testAddressResolution : elapsed 0 : OK
Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK
Zookeeper_init::testNullAddressString : elapsed 0 : OK
Zookeeper_init::testEmptyAddressString : elapsed 0 : OK
Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK
Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK
Zookeeper_init::testInvalidAddressString2 : elapsed 2 : OK
Zookeeper_init::testNonexistentHost : elapsed 108 : OK
Zookeeper_init::testOutOfMemory_init : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs1 : elapsed 0 : OK
Zookeeper_init::testOutOfMemory_getaddrs2 : elapsed 0 : OK
Zookeeper_init::testPermuteAddrsList : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected1 : elapsed 0 : OK
Zookeeper_close::testCloseConnected1 : elapsed 0 : OK
Zookeeper_close::testCloseFromWatcher1 : elapsed 0 : OK
Zookeeper_simpleSystem::testAsyncWatcherAutoResetterminate called after
throwing an instance of 'CppUnit::Exception'
  what():  equality assertion failed
- Expected: -101
- Actual  : -4

make: *** [run-check] Aborted

This appears to come from tests/TestClient.cc - but beyond that, it is hard
to identify which equality assertion failed.
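
The two numbers in the assertion look like ZooKeeper C client return codes. Assuming they are (the test source is not shown here), a small lookup for a few of the codes defined in the client's zookeeper.h makes the failure easier to read; the function name is made up and the list is deliberately incomplete:

```shell
# Map a few ZooKeeper C client return codes to their names as defined in
# zookeeper.h. Only a handful of codes are listed here.
zk_error_name() {
    case $1 in
        0)    echo ZOK ;;
        -4)   echo ZCONNECTIONLOSS ;;
        -7)   echo ZOPERATIONTIMEOUT ;;
        -101) echo ZNONODE ;;
        -110) echo ZNODEEXISTS ;;
        -112) echo ZSESSIONEXPIRED ;;
        *)    echo "unknown ($1)" ;;
    esac
}

zk_error_name -101   # prints: ZNONODE
zk_error_name -4     # prints: ZCONNECTIONLOSS
```

Read that way, the test expected a "no node" result but got a connection loss, which would fit a client that cannot reach a server.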

Help !

regards,
Martin


Re: unit test failure

2010-07-14 Thread Mahadev Konar
Hi Martin,
  Can you check if you have a stale java process (ZooKeeperServer) running
on your machine? That might cause some issues with the tests.


Thanks 
mahadev


On 7/14/10 8:03 AM, Martin Waite waite@gmail.com wrote:

 Hi,
 
 I am attempting to build the C client on debian lenny.
 
 autoconf, configure, make and make install all appear to work cleanly.
 
 I ran:
 
 autoreconf -if
 ./configure
 make
 make install
 make run-check
 
 However, the unit tests fail:
 
 $ make run-check
 make  zktest-st zktest-mt
 make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c'
 make[1]: `zktest-st' is up to date.
 make[1]: `zktest-mt' is up to date.
 make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c'
 ./zktest-st
 ./tests/zkServer.sh: line 52: kill: (17711) - No such process
  ZooKeeper server startedRunning
 Zookeeper_operations::testPing : elapsed 1 : OK
 Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK
 Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK
 Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 2 :
 OK
 Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 :
 OK
 Zookeeper_operations::testConcurrentOperations1 : elapsed 206 : OK
 Zookeeper_init::testBasic : elapsed 0 : OK
 Zookeeper_init::testAddressResolution : elapsed 0 : OK
 Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK
 Zookeeper_init::testNullAddressString : elapsed 0 : OK
 Zookeeper_init::testEmptyAddressString : elapsed 0 : OK
 Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK
 Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK
 Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK
 Zookeeper_init::testInvalidAddressString2 : elapsed 2 : OK
 Zookeeper_init::testNonexistentHost : elapsed 108 : OK
 Zookeeper_init::testOutOfMemory_init : elapsed 0 : OK
 Zookeeper_init::testOutOfMemory_getaddrs1 : elapsed 0 : OK
 Zookeeper_init::testOutOfMemory_getaddrs2 : elapsed 0 : OK
 Zookeeper_init::testPermuteAddrsList : elapsed 0 : OK
 Zookeeper_close::testCloseUnconnected : elapsed 0 : OK
 Zookeeper_close::testCloseUnconnected1 : elapsed 0 : OK
 Zookeeper_close::testCloseConnected1 : elapsed 0 : OK
 Zookeeper_close::testCloseFromWatcher1 : elapsed 0 : OK
 Zookeeper_simpleSystem::testAsyncWatcherAutoResetterminate called after
 throwing an instance of 'CppUnit::Exception'
   what():  equality assertion failed
 - Expected: -101
 - Actual  : -4
 
 make: *** [run-check] Aborted
 
 This appears to come from tests/TestClient.cc - but beyond that, it is hard
 to identify which equality assertion failed.
 
 Help !
 
 regards,
 Martin



Re: test failures in branch-3.2

2009-07-31 Thread Patrick Hunt

Hi Todd,

Sorry for the clutter/confusion. Usually things aren't this cumbersome ;-)

In particular:
  1 committer is on vacation
  Mahadev's been out sick for multiple days
  I'm sick but trying to hang in there, but def not 100%

Hudson (CI) has been offline for effectively the past 3 weeks (that 
gates all our commits) and is just now back but flaky.


3.2 had some bugs that we are trying to address, but the aforementioned
issues are slowing us down. Otherwise we'd have all this straightened out by
now.


At this point you should move this discussion to the dev list - Apache 
doesn't really like us to discuss code changes/futures here (user list). 
On that list you'll also see the plan for upcoming releases - I mention 
all this because we are actively working toward 3.2.1 which will include 
the JIRAs slated for that release (I'm sure you've seen).


If you can wait a bit you might be able to avoid some pain by using the 
upcoming 3.2.1 release. Once the patches land into that branch your 
issues will be resolved w/o you needing to manually apply patches, etc...



I did look at the files you attached - they look fine, so I'm not sure what
the issue is. The form of this test makes it harder - we are verifying that
the log contains sufficient information when a particular error occurs. We
fiddle with log4j in order to do this, which means that the log you are
including doesn't specify the problem.


Try instrumenting this test with a try/catch around the content of the 
test method (all the code in the failing method inside a big try/catch 
is what I mean). Then print the error to std out as part of the catch. 
That should shed some light. If you could debug it a bit that would help 
- because we aren't seeing this in our environment.


Again, sort of a moot point if you can wait a week or so...

Regards,

Patrick

Todd Greenwood wrote:

Inline.


-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, July 30, 2009 10:57 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:

Starting w/ branch-3.2 (no changes) I applied patches in this order:

1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails.

2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file -
PortAssignment.java.

PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch,
which is a pretty hefty patch ( 2k lines) and touches a large number of
files.

Hrm, those patches were probably created against the trunk. We'll have
to have separate patches for trunk and 3.2 branch on 481.

If you could update the jira with this detail (481 needs two patches,
one for each branch) that would be great!



Done.


3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm
crashes).

473 is special (unique) in the sense that it changes log4j while the
vm is running. In general though it's a pretty boring test and
shouldn't be failing.

Are you sure you have the right patch file? There are 2 patch files on
the JIRA for 473; make sure that you have the one from 7/16, NOT the one
from 7/15. Check the patch file: the correct one should NOT contain
changes to build.xml or conf/log4j* files. If this still happens send me
your build.xml, conf/log4j* and QuorumPeerMainTest.java files in email
for review. I'll take a look.




I've annotated the files w/ their date while downloading:
112700 2009-07-31 11:02 ZOOKEEPER-473-7-15.patch
110607 2009-07-31 11:01 ZOOKEEPER-473-7-16.patch

It appears I applied the 7-16 patch, as that is the matching file size
of the patch file I applied.

If there are to be multiple patch files for multiple branches (3.2,
trunk, etc.) would it make sense to label the patch files accordingly?

Requested files in attached tar.

-Todd


Patrick



[junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
[junit] Running
org.apache.zookeeper.server.quorum.QuorumPeerMainTest
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest
FAILED (crashed)


Test Log

Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest
Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec

Testcase: testBadPeerAddressInQuorum took 0.004 sec
Caused an ERROR
Forked Java VM exited abnormally. Please note the time in the report
does not reflect the time until the VM exit.
junit.framework.AssertionFailedError: Forked Java VM exited abnormally.
Please note the time in the report does not reflect the time until the
VM exit.

-Todd

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Thursday, July 30, 2009 10:13 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:


[Todd] Yes, I believe address in use was the problem w/ FLETest. I
assumed it was a timing issue w/ respect to test A not fully

RE: test failures in branch-3.2

2009-07-31 Thread Todd Greenwood
Patrick,
Thank you for the background (and I hope you and Mahadev recover
quickly).

On a plus note, I'm finding that this morning, @work rather than @home,
the tests continue to completion. However, there are other issues that
I'll bring up on the dev list, such as a requirement to have autoconf
installed, and problems in the create-cppunit-configure task that can't
exec libtoolize, fun stuff like that.

I need to proceed with the manual patches to branch-3.2, as I am under
some time constraints to get our infrastructure deployed such that QA
can start playing with it. However, I'll switch to 3.2.1 as soon as I
can.

-Todd

 -Original Message-
 From: Patrick Hunt [mailto:ph...@apache.org]
 Sent: Friday, July 31, 2009 11:38 AM
 To: zookeeper-user@hadoop.apache.org; Todd Greenwood
 Subject: Re: test failures in branch-3.2
 
 Hi Todd,
 
 Sorry for the clutter/confusion. Usually things aren't this cumbersome ;-)
 
 In particular:
 1 committer is on vacation
 Mahadev's been out sick for multiple days
 I'm sick but trying to hang in there, but def not 100%
 
 Hudson (CI) has been offline for effectively the past 3 weeks (that
 gates all our commits) and is just now back but flaky.
 
 3.2 had some bugs that we are trying to address, but the aforementioned
 issues are slowing us down. Otw we'd have all this straightened out by
 now
 
Re: test failures in branch-3.2

2009-07-31 Thread Patrick Hunt

Todd Greenwood wrote:

On a plus note, I'm finding that this morning, @work rather than @home,
the tests continue to completion. However, there are other issues that
I'll bring up on the dev list, such as a requirement to have autoconf
installed, and problems in the create-cppunit-configure task that can't
exec libtoolize, fun stuff like that.


Great, good to hear. At some point figuring out what's up with your 
@home would be interesting to us. :-)


Yes, there are some basic requirements such as autotool, cppunit, etc... 
but please do raise all this on the dev list.



I need to proceed with the manual patches to branch-3.2, as I am under
some time constraints to get our infrastructure deployed such that QA
can start playing with it. However, I'll switch to 3.2.1 as soon as I
can.


Understood.

Patrick


Re: bad svn url : test-patch

2009-07-30 Thread Mahadev Konar
Hi Todd,
  Yes this happens with the branch 3.2. The test-patch link is broken
because of the hadoop split. This file is used for the hudson test environment.
It isn't used anywhere else, so the svn co otherwise should be fine. We
should fix it anyways.

Thanks
mahadev


On 7/30/09 2:57 PM, Todd Greenwood to...@audiencescience.com wrote:

 FYI - looks like there is a bad url in svn...
 
 $ svn co
 http://svn.apache.org/repos/asf/hadoop/zookeeper/branches/branch-3.2
 branch-3.2
 
 ...
 A    branch-3.2/build.xml
 
 Fetching external item into 'branch-3.2/src/java/test/bin'
 svn: URL
 'http://svn.apache.org/repos/asf/hadoop/common/nightly/test-patch'
 doesn't exist
 
 This does not repro w/ 3.1:
 
 $ svn co
 http://svn.apache.org/repos/asf/hadoop/zookeeper/branches/branch-3.1
 branch-3.1
 
 -Todd
 



RE: bad svn url : test-patch

2009-07-30 Thread Todd Greenwood
Thanks Mahadev.


Re: test failures in branch-3.2

2009-07-30 Thread Flavio Junqueira

Todd,

On Jul 30, 2009, at 5:08 PM, Todd Greenwood wrote:

The build succeeds, but not all of the tests. In previous test runs,
I noticed an error in org.apache.zookeeper.test.FLETest. It was not able
to bind to a port or something. Now, after a machine reboot, I'm getting
different failures.



This issue might be fixed in trunk, but not in the 3.2 distribution.


branch-3.2 $ ant test

[junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest
FAILED (crashed)
[junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED



HierarchicalQuorumTest is supposed to fail until you apply the patches  
I mentioned. I don't know what could have caused the crash of the jvm  
in the other one.


-Flavio


Re: test failures in branch-3.2

2009-07-30 Thread Patrick Hunt
btw QuorumPeerMainTest uses the CONSOLE appender which is setup in 
conf/log4j.properties, now that I think of it perhaps not such a good 
idea :-)


If you edited conf/log4j.properties it may be causing the test to fail. 
Did you do this? (If you run the test by itself using -Dtestcase, does it 
always fail?)


I've entered a jira to address this:
https://issues.apache.org/jira/browse/ZOOKEEPER-492

Patrick

Patrick Hunt wrote:

Todd Greenwood wrote:

The build succeeds, but not all of the tests. In previous test runs,
I noticed an error in org.apache.zookeeper.test.FLETest. It was not able
to bind to a port or something. Now, after a machine reboot, I'm getting
different failures. 


address in use? That's a problem in the test framework pre-3.3. In 3.3 
(current svn trunk) I fixed it but it's not in 3.2.x. This is a problem 
with the test framework though and not a real problem, it shows up 
occasionally (depends on timing).
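
One common way for a test harness to sidestep this kind of clash is to ask the OS for an unused port instead of hard-coding one. The sketch below shows that general technique only; it is not the change made in trunk, and the class and method names are invented for the example.

```java
import java.io.IOException;
import java.net.ServerSocket;

class FreePortSketch {
    /** Asks the OS for a currently unused TCP port by binding to port 0. */
    static int findFreePort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        } // the probe socket is closed here, releasing the port for the test
    }

    public static void main(String[] args) throws IOException {
        System.out.println("test server could listen on port " + findFreePort());
    }
}
```

There is still a small window between closing the probe socket and the test binding the port, so this reduces collisions rather than ruling them out.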



branch-3.2 $ ant test

[junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest
FAILED (crashed)
[junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED

Test logs for these two tests attached.


This is unusual though - looking at the log it seems that the JVM itself 
crashed for the QPMainTest! For HQT we are seeing:


junit.framework.AssertionFailedError: Threads didn't join

which Flavio mentioned to me once is possible to happen but not a real 
problem (he can elaborate).


What version of java are you using? OS, other environment that might be 
interesting? (vm? etc...) You might try looking at the jvm crash dump 
file (I think it's in /tmp)


If you run each of these two tests individually do they run? example:
ant -Dtestcase=FLENewEpochTest test-core-java


My goal here is to get to a known state (all tests succeeding or have
workarounds for the failures). Following that, I plan to apply the
patches Flavio recommended for a WAN deploy (479 and 481). After I
verify that the tests continue to run, I'll package this up and deploy
it to our WAN for testing. 


Sounds like a good plan.


So, are these known issues? Do the tests normally run en masse, or do
some of the tests hold on to resources and prevent other tests from
passing?


Typically they do run to completion, but occasionally on my machine 
(java 1.6, linux32bit, 1.6g single core cpu, 1gigmem) I'll get some 
random failure due to address in use, or the same "didn't join" that you 
saw. Usually I see this if I'm multitasking (vs just letting the tests 
run w/o using the box). As I said this is addressed in 3.3 (address 
reuse at the very least, and I haven't seen the other issues).


Patrick




RE: test failures in branch-3.2

2009-07-30 Thread Todd Greenwood
No edits to conf/log4j.properties.


Re: test failures in branch-3.2

2009-07-30 Thread Patrick Hunt
Well, try running these two tests individually and see if they always 
fail or just occasionally. That will be a good start (and the env detail).


Patrick

Todd Greenwood wrote:

No edits to conf/log4j.properties.


RE: test failures in branch-3.2

2009-07-30 Thread Todd Greenwood
Patrick, inline.

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org] 
Sent: Thursday, July 30, 2009 9:13 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:
 The build succeeds, but not all of the tests. In previous test runs,
 I noticed an error in org.apache.zookeeper.test.FLETest. It was not able
 to bind to a port or something. Now, after a machine reboot, I'm getting
 different failures. 

address in use? That's a problem in the test framework pre-3.3. In 3.3
(current svn trunk) I fixed it but it's not in 3.2.x. This is a problem 
with the test framework though and not a real problem, it shows up 
occasionally (depends on timing).

[Todd] Yes, I believe address in use was the problem w/ FLETest. I
assumed it was a timing issue w/ respect to test A not fully releasing
resources before test B started.

 branch-3.2 $ ant test
 
 [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest
 FAILED (crashed)
 [junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED
 
 Test logs for these two tests attached.

This is unusual though - looking at the log it seems that the JVM itself
crashed for the QPMainTest! For HQT we are seeing:

junit.framework.AssertionFailedError: Threads didn't join

which Flavio mentioned to me once is possible to happen but not a real 
problem (he can elaborate).

What version of java are you using? OS, other environment that might be 
interesting? (vm? etc...) You might try looking at the jvm crash dump 
file (I think it's in /tmp)

[Todd] ---
$ uname -a
Linux TODDG01LT 2.6.28-14-generic #47-Ubuntu SMP Sat Jul 25 01:19:55 UTC
2009 x86_64 GNU/Linux

$ which java
/home/toddg/bin/x64/java/jdk1.6.0_13/bin/java

$ java -version
java version "1.6.0_13"
Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed mode)

Memory = 4GB
[Todd] ---

If you run each of these two tests individually do they run? example:
ant -Dtestcase=FLENewEpochTest test-core-java

[Todd] Will try this once my local build is working and report back.
I'll open a separate mail thread on applying patches.

 My goal here is to get to a known state (all tests succeeding or have
 workarounds for the failures). Following that, I plan to apply the
 patches Flavio recommended for a WAN deploy (479 and 481). After I
 verify that the tests continue to run, I'll package this up and deploy
 it to our WAN for testing. 

Sounds like a good plan.

 So, are these known issues? Do the tests normally run en masse, or do
 some of the tests hold on to resources and prevent other tests from
 passing?

Typically they do run to completion, but occasionally on my machine 
(java 1.6, linux32bit, 1.6g single core cpu, 1gigmem) I'll get some 
random failure due to address in use, or the same "didn't join" that you
saw. Usually I see this if I'm multitasking (vs just letting the tests 
run w/o using the box). As I said this is addressed in 3.3 (address 
reuse at the very least, and I haven't seen the other issues).

Patrick




RE: test failures in branch-3.2

2009-07-30 Thread Todd Greenwood
Patrick/Flavio -

Starting w/ branch-3.2 (no changes) I applied patches in this order:

1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails.
2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file -
PortAssignment.java.

PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch,
which is a pretty hefty patch ( 2k lines) and touches a large number of
files. 

3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm
crashes).

[junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
[junit] Running
org.apache.zookeeper.server.quorum.QuorumPeerMainTest
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest
FAILED (crashed)


Test Log

Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest
Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec 

Testcase: testBadPeerAddressInQuorum took 0.004 sec 
Caused an ERROR
Forked Java VM exited abnormally. Please note the time in the report
does not reflect the time until the VM exit.
junit.framework.AssertionFailedError: Forked Java VM exited abnormally.
Please note the time in the report does not reflect the time until the
VM exit.

-Todd

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org] 
Sent: Thursday, July 30, 2009 10:13 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:
 
 [Todd] Yes, I believe address in use was the problem w/ FLETest. I
 assumed it was a timing issue w/ respect to test A not fully releasing
 resources before test B started.

Might be, but actually I think it's related to this:
http://hea-www.harvard.edu/~fine/Tech/addrinuse.html
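
A frequent cause of "address already in use" between back-to-back tests is a socket from the previous test still lingering in TIME_WAIT, and the usual mitigation is to set SO_REUSEADDR before binding. A minimal sketch of that in Java follows; the class and method names are made up for illustration, and this is not taken from the ZooKeeper test code.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

class ReuseAddressSketch {
    /** Binds a listening socket on the given port with SO_REUSEADDR enabled. */
    static ServerSocket bindWithReuse(int port) throws IOException {
        ServerSocket server = new ServerSocket(); // created unbound
        try {
            server.setReuseAddress(true);         // must be set before bind()
            server.bind(new InetSocketAddress(port));
            return server;
        } catch (IOException e) {
            server.close();                       // do not leak the socket on failure
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        int port;
        try (ServerSocket first = bindWithReuse(0)) { // port 0: let the OS choose
            port = first.getLocalPort();
        }
        // Rebind the same port right away, as a following test would.
        try (ServerSocket second = bindWithReuse(port)) {
            System.out.println("rebound port " + second.getLocalPort()
                    + ", reuse=" + second.getReuseAddress());
        }
    }
}
```

The option has to be set on an unbound socket, which is why the sketch uses the no-argument ServerSocket constructor rather than new ServerSocket(port).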

Patrick


Re: test failures in branch-3.2

2009-07-30 Thread Patrick Hunt

Todd Greenwood wrote:

Starting w/ branch-3.2 (no changes) I applied patches in this order:

1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails.
2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file -
PortAssignment.java.

PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch,
which is a pretty hefty patch ( 2k lines) and touches a large number of
files. 


Hrm, those patches were probably created against the trunk. We'll have 
to have separate patches for trunk and 3.2 branch on 481.


If you could update the jira with this detail (481 needs two patches, 
one for each branch) that would be great!



3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm
crashes).


473 is special (unique) in the sense that it changes log4j while the 
vm is running. In general though it's a pretty boring test and 
shouldn't be failing.


Are you sure you have the right patch file? There are 2 patch files on 
the JIRA for 473; make sure that you have the one from 7/16, NOT the one 
from 7/15. Check the patch file: the correct one should NOT contain 
changes to build.xml or conf/log4j* files. If this still happens send me 
your build.xml, conf/log4j* and QuorumPeerMainTest.java files in email 
for review. I'll take a look.
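
The check on the patch contents can be automated with something like the sketch below. It assumes the patch was produced by svn diff and therefore carries "Index: <path>" header lines; the class name, method names and sample paths are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PatchContentsSketch {
    private static final String INDEX_PREFIX = "Index: ";

    /** Collects the paths named on "Index: " header lines of an svn-style diff. */
    static List<String> touchedFiles(List<String> patchLines) {
        List<String> files = new ArrayList<>();
        for (String line : patchLines) {
            if (line.startsWith(INDEX_PREFIX)) {
                files.add(line.substring(INDEX_PREFIX.length()).trim());
            }
        }
        return files;
    }

    /** True if the patch touches build.xml or any conf/log4j* file. */
    static boolean touchesBuildOrLog4j(List<String> patchLines) {
        for (String file : touchedFiles(patchLines)) {
            if (file.equals("build.xml") || file.startsWith("conf/log4j")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up sample lines; a real run would read the downloaded patch file.
        List<String> sample = Arrays.asList(
                "Index: src/Example.java",
                "--- src/Example.java (revision 1)",
                "+++ src/Example.java (working copy)",
                "Index: conf/log4j.properties");
        System.out.println("files: " + touchedFiles(sample));
        System.out.println("wrong patch? " + touchesBuildOrLog4j(sample));
    }
}
```

To run it against a downloaded file, pass in the result of java.nio.file.Files.readAllLines for that path instead of the sample list.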


Patrick



[junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
[junit] Running
org.apache.zookeeper.server.quorum.QuorumPeerMainTest
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest
FAILED (crashed)


Test Log

Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest
Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec 

Testcase: testBadPeerAddressInQuorum took 0.004 sec 
Caused an ERROR

Forked Java VM exited abnormally. Please note the time in the report
does not reflect the time until the VM exit.
junit.framework.AssertionFailedError: Forked Java VM exited abnormally.
Please note the time in the report does not reflect the time until the
VM exit.

-Todd
