Re: test failures in branch-3.2

2009-07-31 Thread Patrick Hunt

Todd Greenwood wrote:

> On a plus note, I'm finding that this morning, @work rather than @home,
> the tests continue to completion. However, there are other issues that
> I'll bring up on the dev list, such as a requirement to have autoconf
> installed, and problems in the create-cppunit-configure task that can't
> exec libtoolize, fun stuff like that.

Great, good to hear. At some point figuring out what's up with your
@home would be interesting to us. :-)

Yes, there are some basic requirements such as autotools, cppunit, etc.,
but please do raise all this on the dev list.

> I need to proceed with the manual patches to branch-3.2, as I am under
> some time constraints to get our infrastructure deployed such that QA
> can start playing with it. However, I'll switch to 3.2.1 as soon as I
> can.


Understood.

Patrick


RE: test failures in branch-3.2

2009-07-31 Thread Todd Greenwood
Patrick,
Thank you for the background (and I hope you and Mahadev recover
quickly).

On a plus note, I'm finding that this morning, @work rather than @home,
the tests continue to completion. However, there are other issues that
I'll bring up on the dev list, such as a requirement to have autoconf
installed, and problems in the create-cppunit-configure task that can't
exec libtoolize, fun stuff like that.

I need to proceed with the manual patches to branch-3.2, as I am under
some time constraints to get our infrastructure deployed such that QA
can start playing with it. However, I'll switch to 3.2.1 as soon as I
can.

-Todd

> -Original Message-
> From: Patrick Hunt [mailto:ph...@apache.org]
> Sent: Friday, July 31, 2009 11:38 AM
> To: zookeeper-user@hadoop.apache.org; Todd Greenwood
> Subject: Re: test failures in branch-3.2
> 
> Hi Todd,
>
> Sorry for the clutter/confusion. Usually things aren't this cumbersome ;-)
>
> In particular:
>   1 committer is on vacation
>   Mahadev's been out sick for multiple days
>   I'm sick but trying to hang in there, but def not 100%
>
> Hudson (CI) has been offline for effectively the past 3 weeks (that
> gates all our commits) and is just now back but flaky.
>
> 3.2 had some bugs that we are trying to address, but the afore mentioned
> issues are slowing us down. Otw we'd have all this straightened out by
> now
>
> At this point you should move this discussion to the dev list - Apache
> doesn't really like us to discuss code changes/futures here (user list).
> On that list you'll also see the plan for upcoming releases - I mention
> all this because we are actively working toward 3.2.1 which will include
> the JIRAs slated for that release (I'm sure you've seen).
>
> If you can wait a bit you might be able to avoid some pain by using the
> upcoming 3.2.1 release. Once the patches land into that branch your
> issues will be resolved w/o you needing to manually apply patches, etc...
>
> I did look at the files you attached - it looks fine so I'm not sure the
> issue. The form of this test makes it harder - we are verifying that the
> log contains sufficient information when a particular error occurs. We
> fiddle with log4j in order to do this, which means that the log you are
> including doesn't specify the problem.
>
> Try instrumenting this test with a try/catch around the content of the
> test method (all the code in the failing method inside a big try/catch
> is what I mean). Then print the error to std out as part of the catch.
> That should shed some light. If you could debug it a bit that would help
> - because we aren't seeing this in our environment.
>
> Again, sort of a moot point if you can wait a week or so...
>
> Regards,
>
> Patrick
> 
Re: test failures in branch-3.2

2009-07-31 Thread Patrick Hunt

Hi Todd,

Sorry for the clutter/confusion. Usually things aren't this cumbersome ;-)

In particular:
  1 committer is on vacation
  Mahadev's been out sick for multiple days
  I'm sick but trying to hang in there, but def not 100%

Hudson (CI) has been offline for effectively the past 3 weeks (that 
gates all our commits) and is just now back but flaky.


3.2 had some bugs that we are trying to address, but the aforementioned
issues are slowing us down. Otherwise we'd have all this straightened out
by now.


At this point you should move this discussion to the dev list - Apache 
doesn't really like us to discuss code changes/futures here (user list). 
On that list you'll also see the plan for upcoming releases - I mention 
all this because we are actively working toward 3.2.1 which will include 
the JIRAs slated for that release (I'm sure you've seen).


If you can wait a bit you might be able to avoid some pain by using the 
upcoming 3.2.1 release. Once the patches land into that branch your 
issues will be resolved w/o you needing to manually apply patches, etc...



I did look at the files you attached - they look fine, so I'm not sure what
the issue is. The form of this test makes it harder - we are verifying that
the log contains sufficient information when a particular error occurs. We
fiddle with log4j in order to do this, which means that the log you are
including doesn't show the problem.


Try instrumenting this test with a try/catch around the content of the 
test method (all the code in the failing method inside a big try/catch 
is what I mean). Then print the error to std out as part of the catch. 
That should shed some light. If you could debug it a bit that would help 
- because we aren't seeing this in our environment.
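
A minimal sketch of the instrumentation described above. The class name
InstrumentedTestSketch, the TestBody interface and runInstrumented are
hypothetical names, standing in for "everything inside the failing test
method"; nothing here is taken from the ZooKeeper sources. The only point
is that the error is printed to stdout before it is rethrown, so it stays
visible even while the test is reconfiguring log4j.

```java
import java.io.PrintStream;

/** Hypothetical sketch: wrap a test body so any failure is echoed to a stream. */
final class InstrumentedTestSketch {

    /** Stand-in for the statements currently inside the failing test method. */
    interface TestBody {
        void run() throws Exception;
    }

    /**
     * Runs the body, prints any Throwable (with its stack trace) to the given
     * stream, then rethrows it so the test still fails.
     */
    static void runInstrumented(TestBody body, PrintStream out) throws Exception {
        try {
            body.run();
        } catch (Throwable t) {
            out.println("Test body failed: " + t);
            t.printStackTrace(out);
            out.flush();
            if (t instanceof Exception) throw (Exception) t;
            if (t instanceof Error) throw (Error) t;
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) throws Exception {
        // Inside the real test this would look like:
        //   runInstrumented(() -> { ...original test code... }, System.out);
        runInstrumented(() -> System.out.println("body ran"), System.out);
    }
}
```

Note that a catch block only runs if the JVM is still alive. If the forked
VM exits outright, nothing will be printed, and that absence is itself a
hint that the problem is below the level of the test code.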


Again, sort of a moot point if you can wait a week or so...

Regards,

Patrick

RE: test failures in branch-3.2

2009-07-31 Thread Todd Greenwood
Inline.

> -Original Message-
> From: Patrick Hunt [mailto:ph...@apache.org]
> Sent: Thursday, July 30, 2009 10:57 PM
> To: zookeeper-user@hadoop.apache.org
> Subject: Re: test failures in branch-3.2
> 
> Todd Greenwood wrote:
> > Starting w/ branch-3.2 (no changes) I applied patches in this order:
> >
> > 1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails.
> > 2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file -
> > PortAssignment.java.
> >
> > PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch,
> > which is a pretty hefty patch (> 2k lines) and touches a large number of
> > files.
> 
> Hrm, those patches were probably created against the trunk. We'll have
> to have separate patches for trunk and 3.2 branch on 481.
> 
> If you could update the jira with this detail (481 needs two patches,
> one for each branch) that would be great!
> 

Done.

> > 3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm
> > crashes).
>
> 473 is "special" (unique) in the sense that it changes log4j while the
> vm is running. In general though it's a pretty boring test and
> shouldn't be failing.
>
> Are you sure you have the right patch file? There are 2 patch files on
> the JIRA for 473; make sure that you have the one from 7/16, NOT the one
> from 7/15. Check the patch file: the correct one should NOT contain
> changes to build.xml or conf/log4j* files. If this still happens send me
> your build.xml, conf/log4j* and QuorumPeerMainTest.java files in email
> for review. I'll take a look.
> 


I've annotated the files w/ their date while downloading:
112700 2009-07-31 11:02 ZOOKEEPER-473-7-15.patch
110607 2009-07-31 11:01 ZOOKEEPER-473-7-16.patch

It appears I applied the 7-16 patch, as that is the matching file size
of the patch file I applied.

If there are to be multiple patch files for multiple branches (3.2,
trunk, etc.) would it make sense to label the patch files accordingly?

Requested files in attached tar.

-Todd

> Patrick
> 
> 
> > [junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
> > [junit] Running
> > org.apache.zookeeper.server.quorum.QuorumPeerMainTest
> > [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
> > [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest
> > FAILED (crashed)
> >
> >
> > Test Log
> >
> > Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest
> > Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
> >
> > Testcase: testBadPeerAddressInQuorum took 0.004 sec
> > Caused an ERROR
> > Forked Java VM exited abnormally. Please note the time in the report
> > does not reflect the time until the VM exit.
> > junit.framework.AssertionFailedError: Forked Java VM exited abnormally.
> > Please note the time in the report does not reflect the time until the
> > VM exit.
> >
> > -Todd
> >
> > -Original Message-
> > From: Patrick Hunt [mailto:ph...@apache.org]
> > Sent: Thursday, July 30, 2009 10:13 PM
> > To: zookeeper-user@hadoop.apache.org
> > Subject: Re: test failures in branch-3.2
> >
> > Todd Greenwood wrote:
> >> 
> >> [Todd] Yes, I believe "address in use" was the problem w/ FLETest. I
> >> assumed it was a timing issue w/ respect to test A not fully releasing
> >> resources before test B started.
> >
> > Might be, but actually I think it's related to this:
> > http://hea-www.harvard.edu/~fine/Tech/addrinuse.html
> >
> > Patrick


patch-verification-473.tar.gz
Description: patch-verification-473.tar.gz


Re: test failures in branch-3.2

2009-07-30 Thread Patrick Hunt

Todd Greenwood wrote:

> Starting w/ branch-3.2 (no changes) I applied patches in this order:
>
> 1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails.
> 2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file -
> PortAssignment.java.
>
> PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch,
> which is a pretty hefty patch (> 2k lines) and touches a large number of
> files.


Hrm, those patches were probably created against the trunk. We'll have 
to have separate patches for trunk and 3.2 branch on 481.


If you could update the jira with this detail (481 needs two patches, 
one for each branch) that would be great!



> 3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm
> crashes).

473 is "special" (unique) in the sense that it changes log4j while the
vm is running. In general though it's a pretty boring test and
shouldn't be failing.

Are you sure you have the right patch file? There are 2 patch files on
the JIRA for 473; make sure that you have the one from 7/16, NOT the one
from 7/15. Check the patch file: the correct one should NOT contain
changes to build.xml or conf/log4j* files. If this still happens send me
your build.xml, conf/log4j* and QuorumPeerMainTest.java files in email
for review. I'll take a look.
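
One way to run the check described above without reading a 2k-line patch
by eye is to list the files it touches. A minimal sketch, assuming the
patch is an svn-style unified diff in which each file is introduced by an
"Index: <path>" line or a "+++ <path>" line (an assumption about the
attachment's format, not something this thread confirms). PatchScan and
its method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: report whether a patch touches build.xml or conf/log4j*. */
final class PatchScan {

    /** Collects the file paths named on "Index: " and "+++ " header lines. */
    static Set<String> touchedPaths(List<String> patchLines) {
        Set<String> paths = new LinkedHashSet<>();
        for (String line : patchLines) {
            String path = null;
            if (line.startsWith("Index: ")) {
                path = line.substring("Index: ".length());
            } else if (line.startsWith("+++ ")) {
                // Drop a trailing tab-separated revision or timestamp, if any.
                path = line.substring("+++ ".length()).split("\t", 2)[0];
            }
            if (path != null && !path.trim().isEmpty()) {
                paths.add(path.trim());
            }
        }
        return paths;
    }

    /** Returns the touched paths that should NOT appear in the correct patch. */
    static List<String> suspectPaths(Set<String> paths) {
        List<String> suspects = new ArrayList<>();
        for (String p : paths) {
            // Paths are assumed to be relative to the checkout root.
            if (p.equals("build.xml") || p.startsWith("conf/log4j")) {
                suspects.add(p);
            }
        }
        return suspects;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: PatchScan <patch-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        List<String> suspects = suspectPaths(touchedPaths(lines));
        System.out.println(suspects.isEmpty()
                ? "no build.xml or conf/log4j* changes found"
                : "unexpected changes: " + suspects);
    }
}
```

This is a heuristic only: an added source line that itself begins with
"++ " would be misread as a header, and a patch with prefixed paths (for
example "b/build.xml") would not match the checks as written.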


Patrick





RE: test failures in branch-3.2

2009-07-30 Thread Todd Greenwood
Patrick/Flavio -

Starting w/ branch-3.2 (no changes) I applied patches in this order:

1. Apply ZOOKEEPER-479.patch. Builds, but HierarchicalQuorumTest fails.
2. Apply ZOOKEEPER-481.patch. Fails to build, b/c of missing file -
PortAssignment.java.

PortAssignment.java was added by Patrick as part of ZOOKEEPER-473.patch,
which is a pretty hefty patch (> 2k lines) and touches a large number of
files. 

3. Apply ZOOKEEPER-473.patch. Builds, but QuorumPeerMainTest fails (jvm
crashes).

[junit] Running org.apache.zookeeper.server.quorum.QuorumPeerMainTest
[junit] Running
org.apache.zookeeper.server.quorum.QuorumPeerMainTest
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest
FAILED (crashed)


Test Log

Testsuite: org.apache.zookeeper.server.quorum.QuorumPeerMainTest
Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec 

Testcase: testBadPeerAddressInQuorum took 0.004 sec 
Caused an ERROR
Forked Java VM exited abnormally. Please note the time in the report
does not reflect the time until the VM exit.
junit.framework.AssertionFailedError: Forked Java VM exited abnormally.
Please note the time in the report does not reflect the time until the
VM exit.

-Todd

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org] 
Sent: Thursday, July 30, 2009 10:13 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:
> 
> [Todd] Yes, I believe "address in use" was the problem w/ FLETest. I
> assumed it was a timing issue w/ respect to test A not fully releasing
> resources before test B started.

Might be, but actually I think it's related to this:
http://hea-www.harvard.edu/~fine/Tech/addrinuse.html

Patrick


Re: test failures in branch-3.2

2009-07-30 Thread Patrick Hunt

Todd Greenwood wrote:


> [Todd] Yes, I believe "address in use" was the problem w/ FLETest. I
> assumed it was a timing issue w/ respect to test A not fully releasing
> resources before test B started.


Might be, but actually I think it's related to this:
http://hea-www.harvard.edu/~fine/Tech/addrinuse.html
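
For readers unfamiliar with this failure mode: a bind can fail with
"address in use" when an earlier socket on the same port has not yet been
fully released by the OS. A minimal Java sketch of one common mitigation,
enabling SO_REUSEADDR before binding, follows. It is an illustration only:
it is not taken from the ZooKeeper test framework, and this thread does
not say that this is how the issue is handled in 3.3. ReuseAddrSketch and
bindReusable are hypothetical names.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

/** Hypothetical sketch: bind a listening socket with SO_REUSEADDR enabled. */
final class ReuseAddrSketch {

    /** Opens a listening socket on the given port with SO_REUSEADDR set. */
    static ServerSocket bindReusable(int port) throws IOException {
        ServerSocket server = new ServerSocket(); // created unbound
        try {
            // The option has to be set before bind() to have a defined effect.
            server.setReuseAddress(true);
            server.bind(new InetSocketAddress(port));
            return server;
        } catch (IOException e) {
            server.close();
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS to pick any free port.
        try (ServerSocket first = bindReusable(0)) {
            System.out.println("bound to port " + first.getLocalPort());
        }
    }
}
```

Letting the OS choose the port (port 0) is a separate way for tests to
avoid colliding on fixed port numbers.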

Patrick


RE: test failures in branch-3.2

2009-07-30 Thread Todd Greenwood
Patrick, inline.

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org] 
Sent: Thursday, July 30, 2009 9:13 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

Todd Greenwood wrote:
> The build succeeds, but not all of the tests. In previous test runs,
> I noticed an error in org.apache.zookeeper.test.FLETest. It was not able
> to bind to a port or something. Now, after a machine reboot, I'm getting
> different failures.

"address in use"? That's a problem in the test framework pre-3.3. In 3.3
(current svn trunk) I fixed it but it's not in 3.2.x. This is a problem
with the test framework though and not a real problem, it shows up
occasionally (depends on timing).

[Todd] Yes, I believe "address in use" was the problem w/ FLETest. I
assumed it was a timing issue w/ respect to test A not fully releasing
resources before test B started.

> branch-3.2 $ ant test
> 
> [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest
> FAILED (crashed)
> [junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED
> 
> Test logs for these two tests attached.

This is unusual though - looking at the log it seems that the JVM itself
crashed for the QPMainTest! For HQT we are seeing:

junit.framework.AssertionFailedError: Threads didn't join

which Flavio mentioned to me once is possible to happen but not a real 
problem (he can elaborate).

What version of java are you using? OS, other environment that might be 
interesting? (vm? etc...) You might try looking at the jvm crash dump 
file (I think it's in /tmp)
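
A small sketch for locating such a crash dump. HotSpot JVMs typically
write a fatal error log named hs_err_pid<pid>.log to the working directory
of the crashed process, falling back to the system temp directory if that
is not writable. Which directory the forked junit VM uses here is not
stated in this thread, so the candidate directories below are assumptions.
CrashLogFinder is a hypothetical name.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: look for HotSpot fatal error logs in a directory. */
final class CrashLogFinder {

    /** Lists hs_err_pid*.log files directly inside the given directory. */
    static List<Path> findCrashLogs(Path dir) throws IOException {
        List<Path> found = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return found;
        }
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(dir, "hs_err_pid*.log")) {
            for (Path p : stream) {
                found.add(p);
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Assumed candidates: the current directory and the JVM temp directory.
        Path[] candidates = {
            Paths.get("."),
            Paths.get(System.getProperty("java.io.tmpdir"))
        };
        for (Path dir : candidates) {
            for (Path log : findCrashLogs(dir)) {
                System.out.println(log.toAbsolutePath());
            }
        }
    }
}
```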

[Todd] ---
$ uname -a
Linux TODDG01LT 2.6.28-14-generic #47-Ubuntu SMP Sat Jul 25 01:19:55 UTC 2009 x86_64 GNU/Linux

$ which java
/home/toddg/bin/x64/java/jdk1.6.0_13/bin/java

$ java -version
java version "1.6.0_13"
Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed mode)

Memory = 4GB
[Todd] ---

If you run each of these two tests individually do they run? example:
ant -Dtestcase=FLENewEpochTest test-core-java

[Todd] Will try this once my local build is working and report back.
I'll open a separate mail thread on applying patches.

> My goal here is to get to a known state (all tests succeeding or have
> workarounds for the failures). Following that, I plan to apply the
> patches Flavio recommended for a WAN deploy (479 and 481). After I
> verify that the tests continue to run, I'll package this up and deploy
> it to our WAN for testing. 

Sounds like a good plan.

> So, are these known issues? Do the tests normally run en masse, or do
> some of the tests hold on to resources and prevent other tests from
> passing?

Typically they do run to completion, but occasionally on my machine
(java 1.6, linux32bit, 1.6g single core cpu, 1gigmem) I'll get some
random failure due to address in use, or the same "didn't join" that you
saw. Usually I see this if I'm multitasking (vs just letting the tests
run w/o using the box). As I said this is addressed in 3.3 (address
reuse at the very least, and I haven't seen the other issues).

Patrick




Re: test failures in branch-3.2

2009-07-30 Thread Patrick Hunt
Well, try running these two tests individually and see if they always
fail or just occasionally. That will be a good start (and the env detail).


Patrick





RE: test failures in branch-3.2

2009-07-30 Thread Todd Greenwood
No edits to conf/log4j.properties.

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org] 
Sent: Thursday, July 30, 2009 9:25 PM
To: Patrick Hunt
Cc: zookeeper-user@hadoop.apache.org
Subject: Re: test failures in branch-3.2

btw QuorumPeerMainTest uses the CONSOLE appender which is setup in 
conf/log4j.properties, now that I think of it perhaps not such a good 
idea :-)

If you edited conf/log4j.properties it may be causing the test to fail,
did you do this? (if you run the test by itself using -Dtestcase does it
always fail?)

I've entered a jira to address this:
https://issues.apache.org/jira/browse/ZOOKEEPER-492

Patrick

Patrick Hunt wrote:
> Todd Greenwood wrote:
>> The build succeeds, but not all of the tests. In previous test runs,
>> I noticed an error in org.apache.zookeeper.test.FLETest. It was not able
>> to bind to a port or something. Now, after a machine reboot, I'm getting
>> different failures.
>
> "address in use"? That's a problem in the test framework pre-3.3. In 3.3
> (current svn trunk) I fixed it but it's not in 3.2.x. This is a problem
> with the test framework though and not a real problem, it shows up
> occasionally (depends on timing).
> 
>> branch-3.2 $ ant test
>>
>> [junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest
>> FAILED (crashed)
>> [junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED
>>
>> Test logs for these two tests attached.
> 
> This is unusual though - looking at the log it seems that the JVM itself
> crashed for the QPMainTest! For HQT we are seeing:
>
> junit.framework.AssertionFailedError: Threads didn't join
>
> which Flavio mentioned to me once is possible to happen but not a real
> problem (he can elaborate).
>
> What version of java are you using? OS, other environment that might be
> interesting? (vm? etc...) You might try looking at the jvm crash dump
> file (I think it's in /tmp)
> 
> If you run each of these two tests individually do they run? example:
> ant -Dtestcase=FLENewEpochTest test-core-java
> 
>> My goal here is to get to a known state (all tests succeeding or have
>> workarounds for the failures). Following that, I plan to apply the
>> patches Flavio recommended for a WAN deploy (479 and 481). After I
>> verify that the tests continue to run, I'll package this up and deploy
>> it to our WAN for testing.
> 
> Sounds like a good plan.
> 
>> So, are these known issues? Do the tests normally run en masse, or do
>> some of the tests hold on to resources and prevent other tests from
>> passing?
> 
> Typically they do run to completion, but occasionally on my machine
> (java 1.6, linux32bit, 1.6g single core cpu, 1gigmem) I'll get some
> random failure due to address in use, or the same "didn't join" that you
> saw. Usually I see this if I'm multitasking (vs just letting the tests
> run w/o using the box). As I said this is addressed in 3.3 (address
> reuse at the very least, and I haven't seen the other issues).
> 
> Patrick
> 
> 


Re: test failures in branch-3.2

2009-07-30 Thread Patrick Hunt
btw QuorumPeerMainTest uses the CONSOLE appender which is set up in 
conf/log4j.properties, now that I think of it perhaps not such a good 
idea :-)


If you edited conf/log4j.properties it may be causing the test to fail, 
did you do this? (if you run the test by itself using -Dtestcase does it 
always fail?)


I've entered a jira to address this:
https://issues.apache.org/jira/browse/ZOOKEEPER-492

Patrick





Re: test failures in branch-3.2

2009-07-30 Thread Patrick Hunt

Todd Greenwood wrote:

The build succeeds, but not all of the tests. In previous test runs,
I noticed an error in org.apache.zookeeper.test.FLETest. It was not able
to bind to a port or something. Now, after a machine reboot, I'm getting
different failures. 


"address in use"? That's a problem in the test framework pre-3.3. In 3.3 
(current svn trunk) I fixed it but it's not in 3.2.x. This is a problem 
with the test framework though and not a real problem, it shows up 
occasionally (depends on timing).
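
For readers unfamiliar with the "address in use" failure, here is a minimal
Java sketch of how it arises and one common way tests avoid it. It is an
illustration only: the class and method names (PortBindDemo, secondBindFails,
pickFreePort) are made up for this example, and it is not the change that
went into the 3.3 test framework.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration; not code from the ZooKeeper test framework.
class PortBindDemo {

    // Binds a listening socket, then tries to bind a second socket to the
    // same address and port. Returns true if the second bind is rejected
    // with BindException (the "address already in use" case).
    static boolean secondBindFails() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket first = new ServerSocket(0, 50, loopback)) {
            int port = first.getLocalPort();
            try (ServerSocket second = new ServerSocket()) {
                second.bind(new InetSocketAddress(loopback, port));
                return false; // the second bind unexpectedly succeeded
            } catch (BindException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Asks the OS for a currently free port by binding to port 0.
    // The socket is closed before returning, so another process could
    // still grab the port afterwards; this narrows the race, it does
    // not remove it.
    static int pickFreePort() {
        try (ServerSocket s = new ServerSocket(0)) {
            return s.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind rejected: " + secondBindFails());
        System.out.println("free port from the OS: " + pickFreePort());
    }
}
```

A test that hard-codes its port can hit the first case when an earlier test
(or another process) still holds that port, which would fit the "depends on
timing" description above.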



branch-3.2 $ ant test

[junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)
[junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED

Test logs for these two tests attached.


This is unusual though - looking at the log it seems that the JVM itself 
crashed for the QPMainTest! for HQT we are seeing:


junit.framework.AssertionFailedError: Threads didn't join

which Flavio mentioned to me once is possible to happen but not a real 
problem (he can elaborate).
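
As a rough illustration of why a "didn't join" assertion can be sensitive to
timing, the sketch below joins worker threads with a timeout and reports
whether they all finished. This assumes the test does something similar; the
HierarchicalQuorumTest source was not checked, and JoinWithTimeoutDemo and
joinAll are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration; not taken from HierarchicalQuorumTest.
class JoinWithTimeoutDemo {

    // Waits up to timeoutMs (must be > 0) for each thread in turn.
    // Returns true only if every thread has terminated afterwards.
    static boolean joinAll(List<Thread> threads, long timeoutMs) {
        try {
            for (Thread t : threads) {
                t.join(timeoutMs);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        }
        for (Thread t : threads) {
            if (t.isAlive()) {
                return false; // at least one thread did not finish in time
            }
        }
        return true;
    }

    public static void main(String[] args) {
        CountDownLatch gate = new CountDownLatch(1);
        // This worker stays blocked until the gate is opened.
        Thread slow = new Thread(() -> {
            try {
                gate.await();
            } catch (InterruptedException ignored) {
                // exit quietly if interrupted
            }
        });
        slow.start();
        System.out.println("joined within 50 ms: " + joinAll(Arrays.asList(slow), 50));
        gate.countDown(); // let the worker finish
        System.out.println("joined after release: " + joinAll(Arrays.asList(slow), 5000));
    }
}
```

On a loaded or slow machine a worker can simply need longer than the timeout
allows, so a check like this can fail without anything being wrong in the
code under test.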


What version of java are you using? OS, other environment that might be 
interesting? (vm? etc...) You might try looking at the jvm crash dump 
file (I think it's in /tmp)
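
If it helps with locating the crash dump: HotSpot JVMs normally write a fatal
error log named hs_err_pid<pid>.log into the working directory of the crashed
process, falling back to the system temp directory if that is not writable.
The sketch below lists such files in a given directory. CrashLogFinder and
findCrashLogs are hypothetical names, and where the forked junit JVM's
working directory is in this build is an assumption left to the reader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for locating HotSpot fatal error logs.
class CrashLogFinder {

    // Returns the hs_err_pid*.log files found directly inside dir, sorted by path.
    static List<Path> findCrashLogs(Path dir) {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(dir, "hs_err_pid*.log")) {
            for (Path p : stream) {
                found.add(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) {
        // Check the current directory and the JVM's temp directory.
        Path cwd = Paths.get("").toAbsolutePath();
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        System.out.println(cwd + ": " + findCrashLogs(cwd));
        System.out.println(tmp + ": " + findCrashLogs(tmp));
    }
}
```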


If you run each of these two tests individually do they run? example:
ant -Dtestcase=FLENewEpochTest test-core-java


My goal here is to get to a known state (all tests succeeding or have
workarounds for the failures). Following that, I plan to apply the
patches Flavio recommended for a WAN deploy (479 and 481). After I
verify that the tests continue to run, I'll package this up and deploy
it to our WAN for testing. 


Sounds like a good plan.


So, are these known issues? Do the tests normally run en masse, or do
some of the tests hold on to resources and prevent other tests from
passing?


Typically they do run to completion, but occasionally on my machine 
(java 1.6, linux32bit, 1.6g single core cpu, 1gigmem) I'll get some 
random failure due to address in use, or the same "didn't join" that you 
saw. Usually I see this if I'm multitasking (vs just letting the tests 
run w/o using the box). As I said this is addressed in 3.3 (address 
reuse at the very least, and I haven't seen the other issues).


Patrick




Re: test failures in branch-3.2

2009-07-30 Thread Flavio Junqueira

Todd,

On Jul 30, 2009, at 5:08 PM, Todd Greenwood wrote:

The build succeeds, but not all of the tests. In previous test runs,
I noticed an error in org.apache.zookeeper.test.FLETest. It was not able
to bind to a port or something. Now, after a machine reboot, I'm getting
different failures.



This issue might be fixed in trunk, but not in the 3.2 distribution.


branch-3.2 $ ant test

[junit] Test org.apache.zookeeper.server.quorum.QuorumPeerMainTest FAILED (crashed)
[junit] Test org.apache.zookeeper.test.HierarchicalQuorumTest FAILED



HierarchicalQuorumTest is supposed to fail until you apply the patches  
I mentioned. I don't know what could have caused the crash of the jvm  
in the other one.


-Flavio


Re: test

2008-08-12 Thread Mahadev Konar
Testing please ignore.

mahadev

On 8/12/08 3:46 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote:

> just a test, please ignore