[jira] Updated: (ZOOKEEPER-475) FLENewEpochTest failed on nightly builds.

2009-07-17 Thread Flavio Paiva Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Paiva Junqueira updated ZOOKEEPER-475:
-

Attachment: ZOOKEEPER-475.patch

Another rough patch. It does not make any changes to QuorumCnxManager, but it 
adds one case to FastLeaderElection (FLE). 

> FLENewEpochTest failed on nightly builds.
> -
>
> Key: ZOOKEEPER-475
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-475
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.2.0
>Reporter: Mahadev konar
>Assignee: Flavio Paiva Junqueira
>Priority: Blocker
> Fix For: 3.2.1, 3.3.0
>
> Attachments: ZOOKEEPER-475.patch, ZOOKEEPER-475.patch
>
>
> The FLENewEpochTest failed on one of the nightly builds -
> http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/377.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-477) zkCleanup.sh is flaky

2009-07-17 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-477:


Fix Version/s: 3.3.0, 3.2.1

> zkCleanup.sh is flaky
> -
>
> Key: ZOOKEEPER-477
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-477
> Project: Zookeeper
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.2.0
>Reporter: Fernando
>Assignee: Fernando
> Fix For: 3.2.1, 3.3.0
>
> Attachments: ppp
>
>
> The zkCleanup.sh script is buggy in two ways:
> 1) it doesn't actually pass through the snapshot count, so it doesn't work
> 2) it assumes that there is only dataDir; it doesn't support dataLogDir
> It could also use some cleanup, so that it doesn't blindly call eval on a line 
> from the config file.




[jira] Updated: (ZOOKEEPER-477) zkCleanup.sh is flaky

2009-07-17 Thread Fernando (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fernando updated ZOOKEEPER-477:
---

Attachment: ppp

patch to fix zkCleanup.sh

> zkCleanup.sh is flaky
> -
>
> Key: ZOOKEEPER-477
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-477
> Project: Zookeeper
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.2.0
>Reporter: Fernando
>Assignee: Fernando
> Fix For: 3.2.1, 3.3.0
>
> Attachments: ppp
>
>
> The zkCleanup.sh script is buggy in two ways:
> 1) it doesn't actually pass through the snapshot count, so it doesn't work
> 2) it assumes that there is only dataDir; it doesn't support dataLogDir
> It could also use some cleanup, so that it doesn't blindly call eval on a line 
> from the config file.




[jira] Updated: (ZOOKEEPER-477) zkCleanup.sh is flaky

2009-07-17 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-477:


Assignee: Fernando

> zkCleanup.sh is flaky
> -
>
> Key: ZOOKEEPER-477
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-477
> Project: Zookeeper
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.2.0
>Reporter: Fernando
>Assignee: Fernando
>
> The zkCleanup.sh script is buggy in two ways:
> 1) it doesn't actually pass through the snapshot count, so it doesn't work
> 2) it assumes that there is only dataDir; it doesn't support dataLogDir
> It could also use some cleanup, so that it doesn't blindly call eval on a line 
> from the config file.




[jira] Commented: (ZOOKEEPER-477) zkCleanup.sh is flaky

2009-07-17 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732718#action_12732718
 ] 

Mahadev konar commented on ZOOKEEPER-477:
-

Fernando,
 can you please upload the patch as a file?

Just go to the zookeeper-3.2.0/ directory and run svn diff > patchfile.txt.

Then upload the file via the attach file link on the left-hand side of this page.

That way you will have to click a button agreeing to donate your code to 
Apache, so we do not have any legal issues. Please do take a look at 
http://wiki.apache.org/hadoop/ZooKeeper/PoweredBy on how to contribute. 

Thanks

> zkCleanup.sh is flaky
> -
>
> Key: ZOOKEEPER-477
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-477
> Project: Zookeeper
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.2.0
>Reporter: Fernando
>
> The zkCleanup.sh script is buggy in two ways:
> 1) it doesn't actually pass through the snapshot count, so it doesn't work
> 2) it assumes that there is only dataDir; it doesn't support dataLogDir
> It could also use some cleanup, so that it doesn't blindly call eval on a line 
> from the config file.




[jira] Created: (ZOOKEEPER-478) support custom hostnames for client and quorum connections

2009-07-17 Thread Chris Darroch (JIRA)
support custom hostnames for client and quorum connections
--

 Key: ZOOKEEPER-478
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-478
 Project: Zookeeper
  Issue Type: Improvement
  Components: quorum, server
Affects Versions: 3.2.0
Reporter: Chris Darroch
Priority: Minor


Our system administrators would love it if we could configure ZooKeeper to 
listen for client and quorum connections on a hostname which isn't bound to the 
localhost.

Maybe there's some neat way to do this that I'm not aware of, of course, but 
it looks to me like we would need to change the two ss.socket().bind(new 
InetSocketAddress(port)); calls, one in NIOServerCnxn and one in 
QuorumCnxManager, so that they instead use InetSocketAddress(host, port).  
Obviously that implies some optional definition of a hostname in the config 
file as well, and possibly on the command line.

Does that seem like the right approach?
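
A minimal sketch of the change described above, assuming an optional hostname 
setting; the class name BindAddressSketch and the helper bindAddress are 
hypothetical and are not part of ZooKeeper. With no hostname it keeps the 
current behaviour (bind to all interfaces); with one it binds to that host only:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

public class BindAddressSketch {
    /**
     * Hypothetical helper: picks the address to bind to.
     * A null or empty host means "all interfaces", as the code does today.
     */
    static InetSocketAddress bindAddress(String host, int port) {
        if (host == null || host.isEmpty()) {
            return new InetSocketAddress(port);       // wildcard address
        }
        return new InetSocketAddress(host, port);     // one configured hostname
    }

    public static void main(String[] args) throws IOException {
        // The host would come from the config file or command line (assumption).
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        try (ServerSocketChannel ss = ServerSocketChannel.open()) {
            ss.socket().bind(bindAddress(host, 0));   // port 0 = any free port
            System.out.println("bound to " + ss.socket().getLocalSocketAddress());
        }
    }
}
```

Keeping the wildcard as the default would leave existing configurations 
unaffected.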




[jira] Commented: (ZOOKEEPER-477) zkCleanup.sh is flaky

2009-07-17 Thread Fernando (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732684#action_12732684
 ] 

Fernando commented on ZOOKEEPER-477:


Here is the diff/patch to apply. Yes, I give all rights to Apache.

--- /export/home/fern/servers/zookeeper-3.2.0/bin/zkCleanup.sh  2009-07-01 09:51:22.0 -0700
+++ puppet-mnt/etc/modules/zookeeper320/files/zkCleanup.sh  2009-07-17 12:01:08.0 -0700
@@ -36,8 +36,16 @@
 
 . $ZOOBINDIR/zkEnv.sh
 
-eval `grep -e "^dataDir=" $ZOOCFG`
+ZOODATADIR=$(grep '^dataDir=' $ZOOCFG | sed -e 's/.*=//')
+ZOODATALOGDIR=$(grep '^dataLogDir=' $ZOOCFG | sed -e 's/.*=//')
 
+if [ "x${ZOODATALOGDIR}" = "x" ]
+then
 java "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \
  -cp $CLASSPATH $JVMFLAGS \
- org.apache.zookeeper.server.PurgeTxnLog $dataDir
+ org.apache.zookeeper.server.PurgeTxnLog $ZOODATADIR $*
+else
+java "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \
+ -cp $CLASSPATH $JVMFLAGS \
+ org.apache.zookeeper.server.PurgeTxnLog $ZOODATALOGDIR $ZOODATADIR $*
+fi
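
For illustration only, the argument selection in the patch above can be 
sketched in Java. This is not part of the patch: the class CleanupArgsSketch 
and the method purgeArgs are hypothetical, and the sketch assumes the config 
file can be read as plain key=value lines. It parses the values instead of 
eval-ing them, puts dataLogDir first when it is set, and passes any extra 
arguments (such as the snapshot count) through unchanged:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class CleanupArgsSketch {
    /** Builds the argument list in the same order the patched script uses. */
    static List<String> purgeArgs(String cfgText, String... extra) {
        Properties cfg = new Properties();
        try {
            cfg.load(new StringReader(cfgText));   // parse key=value lines, no eval
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String dataDir = cfg.getProperty("dataDir");
        String dataLogDir = cfg.getProperty("dataLogDir");
        if (dataDir == null || dataDir.isEmpty()) {
            throw new IllegalArgumentException("dataDir is not set");
        }
        List<String> args = new ArrayList<>();
        if (dataLogDir != null && !dataLogDir.isEmpty()) {
            args.add(dataLogDir);                  // only when configured
        }
        args.add(dataDir);
        for (String e : extra) {
            args.add(e);                           // e.g. the snapshot count
        }
        return args;
    }

    public static void main(String[] args) {
        String cfg = "dataDir=/var/zk/data\ndataLogDir=/var/zk/log\n";
        System.out.println(purgeArgs(cfg, "3"));
    }
}
```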


> zkCleanup.sh is flaky
> -
>
> Key: ZOOKEEPER-477
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-477
> Project: Zookeeper
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.2.0
>Reporter: Fernando
>
> The zkCleanup.sh script is buggy in two ways:
> 1) it doesn't actually pass through the snapshot count, so it doesn't work
> 2) it assumes that there is only dataDir; it doesn't support dataLogDir
> It could also use some cleanup, so that it doesn't blindly call eval on a line 
> from the config file.




[jira] Created: (ZOOKEEPER-477) zkCleanup.sh is flaky

2009-07-17 Thread Fernando (JIRA)
zkCleanup.sh is flaky
-

 Key: ZOOKEEPER-477
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-477
 Project: Zookeeper
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.2.0
Reporter: Fernando


The zkCleanup.sh script is buggy in two ways:

1) it doesn't actually pass through the snapshot count, so it doesn't work
2) it assumes that there is only dataDir; it doesn't support dataLogDir

It could also use some cleanup, so that it doesn't blindly call eval on a line 
from the config file.





[jira] Updated: (ZOOKEEPER-475) FLENewEpochTest failed on nightly builds.

2009-07-17 Thread Flavio Paiva Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Paiva Junqueira updated ZOOKEEPER-475:
-

Attachment: ZOOKEEPER-475.patch

Patch so far. 

> FLENewEpochTest failed on nightly builds.
> -
>
> Key: ZOOKEEPER-475
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-475
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.2.0
>Reporter: Mahadev konar
>Assignee: Flavio Paiva Junqueira
>Priority: Blocker
> Fix For: 3.2.1, 3.3.0
>
> Attachments: ZOOKEEPER-475.patch
>
>
> The FLENewEpochTest failed on one of the nightly builds -
> http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/377.




[jira] Commented: (ZOOKEEPER-311) handle small path lengths in zoo_create()

2009-07-17 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732654#action_12732654
 ] 

Mahadev konar commented on ZOOKEEPER-311:
-

The QA isn't running because the nightly builds are failing. This should be 
fixed soon and we can get the patch process back on track.

> handle small path lengths in zoo_create()
> -
>
> Key: ZOOKEEPER-311
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-311
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: c client
>Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.1.1, 3.2.0
>Reporter: Chris Darroch
>Assignee: Chris Darroch
>Priority: Minor
> Fix For: 3.2.1
>
> Attachments: ZOOKEEPER-311.patch, ZOOKEEPER-311.patch
>
>
> The synchronous completion for zoo_create() contains the following code:
> {noformat}
> if (sc->u.str.str_len > strlen(res.path)) {
>     len = strlen(res.path);
> } else {
>     len = sc->u.str.str_len-1;
> }
> if (len > 0) {
>     memcpy(sc->u.str.str, res.path, len);
>     sc->u.str.str[len] = '\0';
> }
> {noformat}
> In the case where the max_realpath_len argument to zoo_create() is 0, none of 
> this code executes, which is OK.  In the case where max_realpath_len is 1, a 
> user might expect their buffer to be filled with a null terminator, but 
> again, nothing will happen (even if strlen(res.path) is 0, which is unlikely 
> since new nodes will have paths longer than "/").
> The name of the argument to zoo_create() is also a little misleading, as is 
> its description ("the maximum length of real path you would want") in 
> zookeeper.h, and the example usage in the Programmer's Guide:
> {noformat}
> int rc = zoo_create(zh,"/xyz","value", 5, &CREATE_ONLY, ZOO_EPHEMERAL, 
> buffer, sizeof(buffer)-1);
> {noformat}
> In fact this value should be the actual length of the buffer, including space 
> for the null terminator.  If the user supplies a max_realpath_len of 10 and a 
> buffer of 11 bytes, and strlen(res.path) is 10, the code will truncate the 
> returned value to 9 bytes and put the null terminator in the second-last 
> byte, leaving the final byte of the buffer unused.
> It would be better, I think, to rename the realpath and max_realpath_len 
> arguments to something like path_buffer and path_buffer_len, akin to 
> zoo_set().  The path_buffer_len would be treated as the full length of the 
> buffer (as the code does now, in fact, but the docs suggest otherwise).
> The code in the synchronous completion could then be changed as per the 
> attached patch.
> Since this would change, slightly, the behaviour or "contract" of the API, I 
> would be inclined to suggest waiting until 4.0.0 to implement this change.




[jira] Commented: (ZOOKEEPER-475) FLENewEpochTest failed on nightly builds.

2009-07-17 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732589#action_12732589
 ] 

Patrick Hunt commented on ZOOKEEPER-475:


The nightly build failed again last night, this time due to a failure in 
HierarchicalQuorumTest.

Flavio, can you take a look? If it's the same issue then we're good; otherwise 
please open another JIRA. We really need to fix these ASAP (to get CI and the 
patch process up and running again):

http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/380/testReport/org.apache.zookeeper.test/HierarchicalQuorumTest/testHierarchicalQuorum/

> FLENewEpochTest failed on nightly builds.
> -
>
> Key: ZOOKEEPER-475
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-475
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.2.0
>Reporter: Mahadev konar
>Assignee: Flavio Paiva Junqueira
>Priority: Blocker
> Fix For: 3.2.1, 3.3.0
>
>
> The FLENewEpochTest failed on one of the nightly builds -
> http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/377.




[jira] Updated: (ZOOKEEPER-475) FLENewEpochTest failed on nightly builds.

2009-07-17 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-475:
---

      Component/s: quorum
         Priority: Blocker  (was: Major)
Affects Version/s: 3.2.0

> FLENewEpochTest failed on nightly builds.
> -
>
> Key: ZOOKEEPER-475
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-475
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.2.0
>Reporter: Mahadev konar
>Assignee: Flavio Paiva Junqueira
>Priority: Blocker
> Fix For: 3.2.1, 3.3.0
>
>
> The FLENewEpochTest failed on one of the nightly builds -
> http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/377.




[jira] Commented: (ZOOKEEPER-311) handle small path lengths in zoo_create()

2009-07-17 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732581#action_12732581
 ] 

Benjamin Reed commented on ZOOKEEPER-311:
-

+1 Great job, Chris! I like your test cases. Thanks. Now let me see if I can 
find out why QA isn't running...

> handle small path lengths in zoo_create()
> -
>
> Key: ZOOKEEPER-311
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-311
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: c client
>Affects Versions: 3.0.0, 3.0.1, 3.1.0, 3.1.1, 3.2.0
>Reporter: Chris Darroch
>Assignee: Chris Darroch
>Priority: Minor
> Fix For: 3.2.1
>
> Attachments: ZOOKEEPER-311.patch, ZOOKEEPER-311.patch
>
>
> The synchronous completion for zoo_create() contains the following code:
> {noformat}
> if (sc->u.str.str_len > strlen(res.path)) {
>     len = strlen(res.path);
> } else {
>     len = sc->u.str.str_len-1;
> }
> if (len > 0) {
>     memcpy(sc->u.str.str, res.path, len);
>     sc->u.str.str[len] = '\0';
> }
> {noformat}
> In the case where the max_realpath_len argument to zoo_create() is 0, none of 
> this code executes, which is OK.  In the case where max_realpath_len is 1, a 
> user might expect their buffer to be filled with a null terminator, but 
> again, nothing will happen (even if strlen(res.path) is 0, which is unlikely 
> since new nodes will have paths longer than "/").
> The name of the argument to zoo_create() is also a little misleading, as is 
> its description ("the maximum length of real path you would want") in 
> zookeeper.h, and the example usage in the Programmer's Guide:
> {noformat}
> int rc = zoo_create(zh,"/xyz","value", 5, &CREATE_ONLY, ZOO_EPHEMERAL, 
> buffer, sizeof(buffer)-1);
> {noformat}
> In fact this value should be the actual length of the buffer, including space 
> for the null terminator.  If the user supplies a max_realpath_len of 10 and a 
> buffer of 11 bytes, and strlen(res.path) is 10, the code will truncate the 
> returned value to 9 bytes and put the null terminator in the second-last 
> byte, leaving the final byte of the buffer unused.
> It would be better, I think, to rename the realpath and max_realpath_len 
> arguments to something like path_buffer and path_buffer_len, akin to 
> zoo_set().  The path_buffer_len would be treated as the full length of the 
> buffer (as the code does now, in fact, but the docs suggest otherwise).
> The code in the synchronous completion could then be changed as per the 
> attached patch.
> Since this would change, slightly, the behaviour or "contract" of the API, I 
> would be inclined to suggest waiting until 4.0.0 to implement this change.




Build failed in Hudson: ZooKeeper-trunk #380

2009-07-17 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/380/

--
[...truncated 216532 lines...]
[junit] expect:StandaloneServer_port
[junit] found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2009-07-17 10:47:04,357 - INFO  
[main-SendThread(localhost:11225):clientcnxn$sendthr...@869] - Attempting 
connection to server localhost/127.0.0.1:11225
[junit] 2009-07-17 10:47:04,357 - INFO  
[main-SendThread(localhost:11225):clientcnxn$sendthr...@785] - Priming 
connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:44750 
remote=localhost/127.0.0.1:11225]
[junit] 2009-07-17 10:47:04,357 - INFO  
[main-SendThread(localhost:11225):clientcnxn$sendthr...@939] - Server 
connection successful
[junit] 2009-07-17 10:47:04,358 - INFO  
[NIOServerCxn.Factory:11225:nioserverc...@587] - Connected to /127.0.0.1:44750 
lastZxid 6
[junit] 2009-07-17 10:47:04,358 - INFO  
[NIOServerCxn.Factory:11225:nioserverc...@968] - Finished init of 
0x1228851e90a valid:true
[junit] 2009-07-17 10:47:04,358 - INFO  
[NIOServerCxn.Factory:11225:nioserverc...@616] - Renewing session 
0x1228851e90a
[junit] 2009-07-17 10:47:15,410 - INFO  [main:zookee...@461] - Closing 
session: 0x1228851e90a
[junit] 2009-07-17 10:47:15,410 - INFO  [main:clientc...@1070] - Closing 
ClientCnxn for session: 0x1228851e90a
[junit] 2009-07-17 10:47:15,411 - INFO  
[ProcessThread:-1:preprequestproces...@384] - Processed session termination 
request for id: 0x1228851e90a
[junit] 2009-07-17 10:47:15,481 - INFO  [SyncThread:0:nioserverc...@837] - 
closing session:0x1228851e90a NIOServerCnxn: 
java.nio.channels.SocketChannel[connected local=/127.0.0.1:11225 
remote=/127.0.0.1:44750]
[junit] 2009-07-17 10:47:15,481 - INFO  
[main-SendThread(localhost:11225):clientcnxn$sendthr...@963] - Exception while 
closing send thread for session 0x1228851e90a : Read error rc = -1 
java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
[junit] 2009-07-17 10:47:15,582 - INFO  [main:clientc...@1056] - 
Disconnecting ClientCnxn for session: 0x1228851e90a
[junit] 2009-07-17 10:47:15,582 - INFO  [main:zookee...@469] - Session: 
0x1228851e90a closed
[junit] 2009-07-17 10:47:15,582 - INFO  
[main-EventThread:clientcnxn$eventthr...@514] - EventThread shut down
[junit] 2009-07-17 10:47:15,582 - INFO  [main:clientb...@375] - tearDown 
starting
[junit] 2009-07-17 10:47:15,583 - INFO  [main:zookee...@461] - Closing 
session: 0x1228851e90a
[junit] 2009-07-17 10:47:15,583 - INFO  [main:clientc...@1070] - Closing 
ClientCnxn for session: 0x1228851e90a
[junit] 2009-07-17 10:47:15,583 - INFO  [main:clientc...@1056] - 
Disconnecting ClientCnxn for session: 0x1228851e90a
[junit] 2009-07-17 10:47:15,583 - INFO  [main:zookee...@469] - Session: 
0x1228851e90a closed
[junit] 2009-07-17 10:47:15,583 - INFO  [main:clientb...@352] - STOPPING 
server
[junit] 2009-07-17 10:47:15,583 - INFO  
[NIOServerCxn.Factory:11225:nioservercnxn$fact...@239] - NIOServerCnxn factory 
exited run method
[junit] 2009-07-17 10:47:15,584 - INFO  [main:finalrequestproces...@283] - 
shutdown of request processor complete
[junit] 2009-07-17 10:47:15,584 - INFO  
[SyncThread:0:syncrequestproces...@134] - SyncRequestProcessor exited!
[junit] 2009-07-17 10:47:15,584 - INFO  
[ProcessThread:-1:preprequestproces...@119] - PrepRequestProcessor exited loop!
[junit] ensureOnly:[]
[junit] 2009-07-17 10:47:15,587 - INFO  [main:clientb...@391] - FINISHED 
testWatcherAutoResetDisabledWithGlobal
[junit] 2009-07-17 10:47:15,588 - INFO  [main:clientb...@330] - STARTING 
testWatcherAutoResetDisabledWithLocal
[junit] 2009-07-17 10:47:15,593 - INFO  [main:clientb...@345] - STARTING 
server
[junit] 2009-07-17 10:47:15,593 - INFO  [main:zookeeperser...@159] - 
Created server
[junit] 2009-07-17 10:47:15,593 - INFO  [main:nioservercnxn$fact...@125] - 
binding to port 11226
[junit] 2009-07-17 10:47:15,594 - INFO  [main:filetxnsnap...@208] - 
Snapshotting: 0
[junit] ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] 2009-07-17 10:47:15,596 - INFO  
[NIOServerCxn.Factory:11226:nioserverc...@702] - Processing stat command from 
/127.0.0.1:46703
[junit] 2009-07-17 10:47:15,596 - WARN  
[NIOServerCxn.Factory:11226:nioserverc...@498] - Exception causing close of 
session 0x0 due to java.io.IOException: Responded to info probe
[junit] 2009-07-17 10:47:15,596 - INFO  
[NIOServerCxn.Factory:11226:nioserverc...@837] - closing session:0x0 
NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/127.0.0.1:11226 
remote=/127.0.0.1:46703]
[junit] expect:InMemoryDataTree
[junit] found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] expect:StandaloneServer_port
[junit] found:StandaloneServ

[jira] Commented: (ZOOKEEPER-475) FLENewEpochTest failed on nightly builds.

2009-07-17 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732451#action_12732451
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-475:
--

Great catch! (I know it was Hudson, but it is good that you saw it.)

The short version of the story is that the synchronization is not correct in 
QuorumCnxManager. 

The longer version is like this. From the traces, I can see the following 
sequence of messages:

* Replica 1 sends a message to itself and to Replica 2 stating that its current 
vote is for replica 1;
* Replica 2 sends a message to itself and to Replica 1 stating that its current 
vote is for replica 2;
* Replica 1 updates its vote, and sends a message to itself stating that its 
current vote is for replica 2;
* Since replica 1 has two votes for 2 in an ensemble of 3 replicas, replica 1 
decides to follow 2.

The problem is that replica 2 does not receive a message from 1 stating that it 
changed its vote to 2, which prevents 2 from becoming the leader. Looking 
more carefully at why that happened, you can see that when 1 tries to send a 
message to 2, QuorumCnxManager in 1 is shutting down a connection to 2 at 
the same time that it is trying to open a new one. The incorrect 
synchronization prevents the creation of the new connection, and 1 and 2 end up 
not connected.
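
To make the vote counting concrete, here is a simplified sketch of the two 
views in the trace above. It is only an illustration: the class 
VoteCountSketch and the method hasQuorum are hypothetical, and the real 
election logic in FastLeaderElection considers more than a plain majority 
count. Replica 1 sees two votes for 2 and decides; replica 2 never receives 
the updated vote from 1, so it sees only its own vote for itself:

```java
import java.util.HashMap;
import java.util.Map;

public class VoteCountSketch {
    /** True if strictly more than half of the ensemble backs the candidate. */
    static boolean hasQuorum(Map<Integer, Integer> votes, int candidate, int ensembleSize) {
        int count = 0;
        for (int vote : votes.values()) {
            if (vote == candidate) {
                count++;
            }
        }
        return count > ensembleSize / 2;
    }

    public static void main(String[] args) {
        // Votes known to replica 1 (voter id -> voted-for id):
        // its own updated vote plus the vote received from replica 2.
        Map<Integer, Integer> seenBy1 = new HashMap<>();
        seenBy1.put(1, 2);
        seenBy1.put(2, 2);

        // Votes known to replica 2: the update from replica 1 never arrived,
        // so it still holds replica 1's original vote for itself.
        Map<Integer, Integer> seenBy2 = new HashMap<>();
        seenBy2.put(1, 1);
        seenBy2.put(2, 2);

        System.out.println("replica 1 sees quorum for 2: " + hasQuorum(seenBy1, 2, 3)); // true
        System.out.println("replica 2 sees quorum for 2: " + hasQuorum(seenBy2, 2, 3)); // false
    }
}
```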

> FLENewEpochTest failed on nightly builds.
> -
>
> Key: ZOOKEEPER-475
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-475
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Mahadev konar
>Assignee: Flavio Paiva Junqueira
> Fix For: 3.2.1, 3.3.0
>
>
> The FLENewEpochTest failed on one of the nightly builds -
> http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/377.
