[jira] Commented: (ZOOKEEPER-493) patch for command line setquota

2009-08-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739428#action_12739428
 ] 

Hudson commented on ZOOKEEPER-493:
--

Integrated in ZooKeeper-trunk #405 (See 
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/405/])
. patch for command line setquota


 patch for command line setquota 
 

 Key: ZOOKEEPER-493
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-493
 Project: Zookeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.2.0
Reporter: steve bendiola
Assignee: steve bendiola
Priority: Minor
 Fix For: 3.2.1, 3.3.0

 Attachments: quotafix.patch, ZOOKEEPER-493.patch


 the command line setquota tries to use argument 3 as both a path and a value
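
For reference, a rough sketch of the intended invocation (the usage string in
the 3.2 command-line client is roughly as below; the path and value here are
made up for illustration):

    setquota -n|-b val path
    setquota -n 1000 /myapp    # e.g. cap /myapp at 1000 child znodes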

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-491) Prevent zero-weight servers from being elected

2009-08-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739426#action_12739426
 ] 

Hudson commented on ZOOKEEPER-491:
--

Integrated in ZooKeeper-trunk #405 (See 
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/405/])
. Prevent zero-weight servers from being elected. (flavio via mahadev)


 Prevent zero-weight servers from being elected
 --

 Key: ZOOKEEPER-491
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-491
 Project: Zookeeper
  Issue Type: New Feature
  Components: leaderElection
Affects Versions: 3.2.0
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
 Fix For: 3.2.1, 3.3.0

 Attachments: ZOOKEEPER-491-3.2branch.patch, ZOOKEEPER-491.patch


 This is a fix to prevent zero-weight servers from being elected leaders. In 
 wide-area scenarios, this allows restricting the set of servers that can 
 lead the ensemble.
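
As a rough illustration of the kind of configuration this enables (group and
weight syntax as in the hierarchical quorum options of the admin guide; the
server ids and grouping below are made up, and the server.N lines are omitted):

    # servers 1-3 in the central data center, 4-5 in remote pods
    group.1=1:2:3
    group.2=4:5
    weight.1=1
    weight.2=1
    weight.3=1
    weight.4=0
    weight.5=0
    # with this fix, servers 4 and 5 (weight 0) can follow but never lead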

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-480) FLE should perform leader check when node is not leading and add vote of follower

2009-08-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739427#action_12739427
 ] 

Hudson commented on ZOOKEEPER-480:
--

Integrated in ZooKeeper-trunk #405 (See 
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/405/])
. FLE should perform leader check when node is not leading and add vote of 
follower (flavio via mahadev)


 FLE should perform leader check when node is not leading and add vote of 
 follower
 -

 Key: ZOOKEEPER-480
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-480
 Project: Zookeeper
  Issue Type: Bug
Affects Versions: 3.2.0
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
 Fix For: 3.2.1, 3.3.0

 Attachments: ZOOKEEPER-480-3.2branch.patch, 
 ZOOKEEPER-480-3.2branch.patch, ZOOKEEPER-480.patch, ZOOKEEPER-480.patch, 
 ZOOKEEPER-480.patch, ZOOKEEPER-480.patch, ZOOKEEPER-480.patch


 As a server may join leader election while others have already elected a 
 leader, it is necessary that a server handles some special cases of leader 
 election when notifications are from servers that are either LEADING or 
 FOLLOWING. In such special cases, we check if we have received a message from 
 the leader to declare a leader elected. This check does not consider the case 
 that the process performing the check might be a recently elected leader, and 
 consequently the check fails.
 This patch also adds a new case, which corresponds to adding a vote to 
 recvset when the notification is from a process LEADING or FOLLOWING. This 
 fixes the case raised in ZOOKEEPER-475.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-447) zkServer.sh doesn't allow different config files to be specified on the command line

2009-08-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739429#action_12739429
 ] 

Hudson commented on ZOOKEEPER-447:
--

Integrated in ZooKeeper-trunk #405 (See 
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/405/])
. zkServer.sh doesn't allow different config files to be specified on the 
command line


 zkServer.sh doesn't allow different config files to be specified on the 
 command line
 

 Key: ZOOKEEPER-447
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-447
 Project: Zookeeper
  Issue Type: Improvement
Affects Versions: 3.1.1, 3.2.0
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Minor
 Fix For: 3.2.1, 3.3.0

 Attachments: ZOOKEEPER-447.patch


 Unless I'm missing something, you can change the directory that the zoo.cfg 
 file is in by setting ZOOCFGDIR but not the name of the file itself.
 I find it convenient myself to specify the config file on the command line, 
 but we should also let it be specified by environment variable.
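
A sketch of the two styles being compared (the exact argument handling depends
on the patch; the paths are hypothetical):

    # today: only the directory can be overridden
    ZOOCFGDIR=/etc/zookeeper/ensemble-a bin/zkServer.sh start

    # proposed: name the config file itself on the command line
    bin/zkServer.sh start /etc/zookeeper/ensemble-a/zoo-a.cfg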

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-484) Clients get SESSION MOVED exception when switching from follower to a leader.

2009-08-05 Thread Giridharan Kesavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giridharan Kesavan updated ZOOKEEPER-484:
-

Status: Open  (was: Patch Available)

resubmitting the patch to the patch queue.

 Clients get SESSION MOVED exception when switching from follower to a leader.
 -

 Key: ZOOKEEPER-484
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-484
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.2.0
Reporter: Mahadev konar
Assignee: Mahadev konar
Priority: Blocker
 Fix For: 3.2.1, 3.3.0

 Attachments: sessionTest.patch, ZOOKEEPER-484.patch


 When a client is connected to a follower, gets disconnected, and then connects 
 to the leader, it gets a SESSION MOVED exception. This is because of a bug in 
 the new feature of ZOOKEEPER-417 that we added in 3.2. All the releases before 
 3.2 DO NOT have this problem. The fix is to make sure the ownership of a 
 connection gets changed when a session moves from a follower to the leader. 
 The workaround in 3.2.0 is to switch off connections from clients to the 
 leader; take a look at the *leaderServers* java property in 
 http://hadoop.apache.org/zookeeper/docs/r3.2.0/zookeeperAdmin.html.
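
For the workaround, the relevant knob in the admin guide is the leader-serves
setting (assuming the property referred to above is the one documented there
as leaderServes); disabling it keeps clients off the leader entirely:

    # on each server's JVM command line, per the r3.2.0 admin guide
    -Dzookeeper.leaderServes=no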

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-484) Clients get SESSION MOVED exception when switching from follower to a leader.

2009-08-05 Thread Giridharan Kesavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giridharan Kesavan updated ZOOKEEPER-484:
-

Status: Patch Available  (was: Open)

 Clients get SESSION MOVED exception when switching from follower to a leader.
 -

 Key: ZOOKEEPER-484
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-484
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.2.0
Reporter: Mahadev konar
Assignee: Mahadev konar
Priority: Blocker
 Fix For: 3.2.1, 3.3.0

 Attachments: sessionTest.patch, ZOOKEEPER-484.patch


 When a client is connected to a follower, gets disconnected, and then connects 
 to the leader, it gets a SESSION MOVED exception. This is because of a bug in 
 the new feature of ZOOKEEPER-417 that we added in 3.2. All the releases before 
 3.2 DO NOT have this problem. The fix is to make sure the ownership of a 
 connection gets changed when a session moves from a follower to the leader. 
 The workaround in 3.2.0 is to switch off connections from clients to the 
 leader; take a look at the *leaderServers* java property in 
 http://hadoop.apache.org/zookeeper/docs/r3.2.0/zookeeperAdmin.html.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



hudson patch build back to normal

2009-08-05 Thread Giridharan Kesavan
The Sendmail issues on hudson.zones are fixed now and the patch build for 
zookeeper has been restarted.

Regards,
Giri


RE: hudson patch build back to normal

2009-08-05 Thread Giridharan Kesavan
If you have changed the JIRA status to Patch Available in the last couple of 
days, please resubmit your patch so that Hudson picks it up for testing.
-Giri

 -Original Message-
 From: Giridharan Kesavan [mailto:gkesa...@yahoo-inc.com]
 Sent: Wednesday, August 05, 2009 7:18 PM
 To: zookeeper-dev@hadoop.apache.org
 Cc: Nigel Daley
 Subject: hudson patch build back to normal
 
 Sendmail issues on hudson.zones is fixed now and patch build for
 zookeeper is restarted.
 
 Regards,
 Giri


Re: hudson patch build back to normal

2009-08-05 Thread Patrick Hunt

Thanks Giri!

Patrick

Giridharan Kesavan wrote:

If you have changed the jira status to patch available in the last couple of 
days please resubmit your patch for hudson to pick your patch for testing.
-Giri


-Original Message-
From: Giridharan Kesavan [mailto:gkesa...@yahoo-inc.com]
Sent: Wednesday, August 05, 2009 7:18 PM
To: zookeeper-dev@hadoop.apache.org
Cc: Nigel Daley
Subject: hudson patch build back to normal

Sendmail issues on hudson.zones is fixed now and patch build for
zookeeper is restarted.

Regards,
Giri


[jira] Commented: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration

2009-08-05 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739609#action_12739609
 ] 

Patrick Hunt commented on ZOOKEEPER-498:


Looks to me like 0 weight is still busted, fle0weighttest is actually failing 
on my machine, however it's reported as success:
- Standard Error -
Exception in thread Thread-108 junit.framework.AssertionFailedError: Elected 
zero-weight server
at junit.framework.Assert.fail(Assert.java:47)
at 
org.apache.zookeeper.test.FLEZeroWeightTest$LEThread.run(FLEZeroWeightTest.java:138)
-  ---

This is probably because the test is calling assert in a thread other than 
the main test thread, which JUnit will not track/know about.
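
A minimal, self-contained sketch of the usual fix for this class of problem
(not the actual FLEZeroWeightTest code): record a failure raised in a worker
thread and re-throw it from the main/test thread so JUnit sees it.

    import java.util.concurrent.atomic.AtomicReference;

    public class WorkerAssertSketch {
        public static void main(String[] args) throws Exception {
            // Failures thrown in other threads are invisible to JUnit unless the
            // main (test) thread collects them and re-throws.
            final AtomicReference<Throwable> failure = new AtomicReference<Throwable>();
            Thread worker = new Thread(new Runnable() {
                public void run() {
                    try {
                        // ... run the election, then check the outcome ...
                        boolean electedZeroWeight = true; // stand-in for the real check
                        if (electedZeroWeight) {
                            throw new AssertionError("Elected zero-weight server");
                        }
                    } catch (Throwable t) {
                        failure.set(t); // record instead of dying silently
                    }
                }
            });
            worker.start();
            worker.join();
            if (failure.get() != null) {
                // re-thrown here, the failure is raised in the thread JUnit tracks
                throw new AssertionError(failure.get());
            }
        }
    }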

One problem I see with these tests (the 0weight test I looked at) -- it doesn't 
have a client attempt to connect to the various servers as part of declaring 
success. Really we should only consider the test successful (i.e. assert that) if a 
client can connect to each server in the cluster and change/see changes. As part 
of fixing this we really need to do a sanity check by testing the various 
command lines and checking that a client can connect.

I'm not even sure FLEnewepochtest/fletest/etc... are passing either. New epoch 
seems to just thrash...

Also I tried 3 and 5 server quorums by hand from the command line with 0 weight 
and they see issues similar to what Todd is seeing.

This is happening for me on both the trunk and 3.2 branch source.

 Unending Leader Elections : WAN configuration
 -

 Key: ZOOKEEPER-498
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Affects Versions: 3.2.0
 Environment: Each machine:
 CentOS 5.2 64-bit
 2GB ram
 java version 1.6.0_13
 Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed 
 Network Topology:
 DC : central data center
 POD(N): remote data center
 Zookeeper Topology:
 Leaders may be elected only in DC (weight = 1)
 Only followers are elected in PODS (weight = 0)
Reporter: Todd Greenwood-Geer
Assignee: Patrick Hunt
Priority: Critical
 Fix For: 3.2.1, 3.3.0

 Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, zoo.cfg


 In a WAN configuration, ZooKeeper is endlessly electing, terminating, and 
 re-electing a ZooKeeper leader. The WAN configuration involves two groups, a 
 central DC group of ZK servers that have a voting weight = 1, and a group of 
 servers in remote pods with a voting weight of 0.
 What we expect to see is leaders elected only in the DC, and the pods to 
 contain only followers. What we are seeing is a continuous cycling of 
 leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended 
 patches (473, 479, 481, 491), and now release 3.2.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration

2009-08-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-498:
---

Attachment: zk498-test.tar.gz

I attached zk498-test.tar.gz - this is a 5 server config (2 zero-weight) that fails 
to achieve quorum.

Run start.sh/stop.sh and check out the individual logs for details.



 Unending Leader Elections : WAN configuration
 -

 Key: ZOOKEEPER-498
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Affects Versions: 3.2.0
 Environment: Each machine:
 CentOS 5.2 64-bit
 2GB ram
 java version 1.6.0_13
 Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed 
 Network Topology:
 DC : central data center
 POD(N): remote data center
 Zookeeper Topology:
 Leaders may be elected only in DC (weight = 1)
 Only followers are elected in PODS (weight = 0)
Reporter: Todd Greenwood-Geer
Assignee: Patrick Hunt
Priority: Critical
 Fix For: 3.2.1, 3.3.0

 Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, 
 zk498-test.tar.gz, zoo.cfg


 In a WAN configuration, ZooKeeper is endlessly electing, terminating, and 
 re-electing a ZooKeeper leader. The WAN configuration involves two groups, a 
 central DC group of ZK servers that have a voting weight = 1, and a group of 
 servers in remote pods with a voting weight of 0.
 What we expect to see is leaders elected only in the DC, and the pods to 
 contain only followers. What we are seeing is a continuous cycling of 
 leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended 
 patches (473, 479, 481, 491), and now release 3.2.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (ZOOKEEPER-499) electionAlg should default to FLE (3) - regression

2009-08-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt reassigned ZOOKEEPER-499:
--

Assignee: Patrick Hunt

 electionAlg should default to FLE (3) - regression
 --

 Key: ZOOKEEPER-499
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-499
 Project: Zookeeper
  Issue Type: Bug
  Components: server, tests
Affects Versions: 3.2.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt
Priority: Blocker
 Fix For: 3.2.1, 3.3.0


 there's a regression in 3.2 - electionAlg is no longer defaulting to 3 
 (incorrectly defaults to 0)
 also - need to have tests to validate this

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-462) Last hint for open ledger

2009-08-05 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-462:


Fix Version/s: 3.3.0

 Last hint for open ledger
 -

 Key: ZOOKEEPER-462
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-462
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib-bookkeeper
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-462.patch


 In some use cases of BookKeeper, it is useful to be able to read from a 
 ledger before closing the ledger. To enable such a feature, the writer has to 
 be able to communicate to a reader how many entries it has been able to write 
 successfully. The main idea of this jira is to continuously update a znode 
 with the number of successful writes, and a reader can, for example, watch 
 the node for changes.
  I was thinking of having a configuration parameter to state how often a 
 writer should update the hint on ZooKeeper (e.g., every 1000 requests, every 
 10,000 requests). Clearly updating more often increases the overhead of 
 writing to ZooKeeper, although the impact on the performance of writes to 
 BookKeeper should be minimal given that we make an asynchronous call to 
 update the hint.
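
A rough sketch of the idea using the plain ZooKeeper client API (the path,
update interval, and encoding are made up; in the real feature this
bookkeeping would live inside the BookKeeper writer):

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs.Ids;
    import org.apache.zookeeper.ZooKeeper;

    public class LedgerHintSketch {
        static final int HINT_INTERVAL = 1000; // hypothetical: publish every 1000 adds

        private final ZooKeeper zk;
        private final String hintPath; // e.g. /ledgers/42/lastAddConfirmed (made up)
        private long confirmed = 0;

        LedgerHintSketch(ZooKeeper zk, String hintPath) throws Exception {
            this.zk = zk;
            this.hintPath = hintPath;
            zk.create(hintPath, "0".getBytes(), Ids.OPEN_ACL_UNSAFE,
                      CreateMode.PERSISTENT);
        }

        // called by the writer after each successfully acknowledged add
        void entryConfirmed() throws Exception {
            confirmed++;
            if (confirmed % HINT_INTERVAL == 0) {
                // asynchronous in the proposal; synchronous here to keep the sketch short
                zk.setData(hintPath, Long.toString(confirmed).getBytes(), -1);
            }
        }
        // a reader watches hintPath and re-reads it on each NodeDataChanged event
    }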

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (ZOOKEEPER-499) electionAlg should default to FLE (3) - regression

2009-08-05 Thread Patrick Hunt (JIRA)
electionAlg should default to FLE (3) - regression
--

 Key: ZOOKEEPER-499
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-499
 Project: Zookeeper
  Issue Type: Bug
  Components: server, tests
Affects Versions: 3.2.0
Reporter: Patrick Hunt
Priority: Blocker
 Fix For: 3.2.1, 3.3.0


there's a regression in 3.2 - electionAlg is no longer defaulting to 3 
(incorrectly defaults to 0)

also - need to have tests to validate this

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-499) electionAlg should default to FLE (3) - regression

2009-08-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-499:
---

Release Note: 
workaround in 3.2.0 (this only affects 3.2.0)

set electionAlg=3 in server config files.

 electionAlg should default to FLE (3) - regression
 --

 Key: ZOOKEEPER-499
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-499
 Project: Zookeeper
  Issue Type: Bug
  Components: server, tests
Affects Versions: 3.2.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt
Priority: Blocker
 Fix For: 3.2.1, 3.3.0


 there's a regression in 3.2 - electionAlg is no longer defaulting to 3 
 (incorrectly defaults to 0)
 also - need to have tests to validate this

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Optimized WAN ZooKeeper Config : Multi-Ensemble configuration

2009-08-05 Thread Mahadev Konar
Todd,
 Comments in line:


On 8/5/09 12:10 PM, Todd Greenwood to...@audiencescience.com wrote:

 Flavio/Patrick/Mahadev -
 
 Thanks for your support to date. As I understand it, the sticky points
 w/ respect to WAN deployments are:
 
 1. Leader Election:
 
 Leader elections in the WAN config (pod zk server weight = 0) is a bit
 troublesome (ZOOKEEPER-498)
Yes, until ZOOKEEPER-498 is fixed, you won't be able to use it with groups
and zero weight.

 
 2. Network Connectivity Required:
 
 ZooKeeper clients cannot read/write to ZK Servers if the Server does not
 have network connectivity to the quorum. In short, there is a hard
 requirement to have network connectivity in order for the clients to
 access the shared memory graph in ZK.
Yes

 
 Alternative
 ---
 
 I have seen some discussion about in the past re: multi-ensemble
 solutions. Essentially, put one ensemble in each physical location
 (POD), and another in your DC, and have a fairly simple process
 coordinate synchronizing the various ensembles. If the POD writes can be
 confined to a sub-tree in the master graph, then this should be fairly
 simple. I'm imagining the following:
 
 DC (master) graph:
 /root/pods/1/data/item1
 /root/pods/1/data/item2
 /root/pods/1/data/item3
 /root/pods/2
 /root/pods/3
 ...etc
 /root/shared/allpods/readonly/data/item1
 /root/shared/allpods/readonly/data/item2
 ...etc
 
 This has the advantage of minimizing cross pod traffic, which could be a
 real perf killer in an WAN. It also provides transacted writes in the
 PODs, even in the disconnected state. Clearly, another portion of the
 business logic has to reconcile the DC (master) graph such that each of
 the pods data items are processed, etc.
 
 Does anyone have any experience with this (pitfalls, suggestions, etc.?)
As far as I understand, you mean having a master cluster with another one in
a different data center syncing with the master (just a subtree)?
Is that correct? 

If yes, this is what one of our users in Yahoo! Search does. They have a
master cluster and a smaller cluster in a different datacenter and a bridge
that copies data from the master cluster (only a subtree) to the smaller one
and keeps them in sync.


Thanks
mahadev
 
 -Todd



RE: Optimized WAN ZooKeeper Config : Multi-Ensemble configuration

2009-08-05 Thread Todd Greenwood
Mahadev, comments inline:

 -Original Message-
 From: Mahadev Konar [mailto:maha...@yahoo-inc.com]
 Sent: Wednesday, August 05, 2009 1:47 PM
 To: zookeeper-dev@hadoop.apache.org
 Subject: Re: Optimized WAN ZooKeeper Config : Multi-Ensemble
configuration
 
 Todd,
  Comments in line:
 
 
 On 8/5/09 12:10 PM, Todd Greenwood to...@audiencescience.com
wrote:
 
  Flavio/Patrick/Mahadev -
 
  Thanks for your support to date. As I understand it, the sticky
points
  w/ respect to WAN deployments are:
 
  1. Leader Election:
 
  Leader elections in the WAN config (pod zk server weight = 0) is a
bit
  troublesome (ZOOKEEPER-498)
 Yes, until ZOOKEEPER-498 is fixed, you wont be able to use it with
groups
 and zero weight.
 
 
  2. Network Connectivity Required:
 
  ZooKeeper clients cannot read/write to ZK Servers if the Server does
not
  have network connectivity to the quorum. In short, there is a hard
  requirement to have network connectivity in order for the clients to
  access the shared memory graph in ZK.
 Yes
 
 
  Alternative
  ---
 
  I have seen some discussion about in the past re: multi-ensemble
  solutions. Essentially, put one ensemble in each physical location
  (POD), and another in your DC, and have a fairly simple process
  coordinate synchronizing the various ensembles. If the POD writes
can be
  confined to a sub-tree in the master graph, then this should be
fairly
  simple. I'm imagining the following:
 
  DC (master) graph:
  /root/pods/1/data/item1
  /root/pods/1/data/item2
  /root/pods/1/data/item3
  /root/pods/2
  /root/pods/3
  ...etc
  /root/shared/allpods/readonly/data/item1
  /root/shared/allpods/readonly/data/item2
  ...etc
 
  This has the advantage of minimizing cross pod traffic, which could
be a
  real perf killer in an WAN. It also provides transacted writes in
the
  PODs, even in the disconnected state. Clearly, another portion of
the
  business logic has to reconcile the DC (master) graph such that each
of
  the pods data items are processed, etc.
 
  Does anyone have any experience with this (pitfalls, suggestions,
etc.?)
 As far as I understand is that you mean that have a master Cluster
with
 other in a different data center syncing with the master (just a
subtree)?
 Is that correct?
 
 If yes, this is what one of our users in Yahoo! Search do. They have a
 master cluster and a smaller cluster in a different datacenter and a
 brdige
 that copies data from the master cluster (only a subtree) to the
smaller
 one
 and keeps them in syncs.
 

Yes, this is exactly what I'm proposing. With the addition that I'll
sync subtrees in both directions, and have a separate process reconcile
data from the various pods, like so:

#pod1 ensemble
/root/a/b

#pod2 ensemble
/root/a/b

#dc ensemble
/root/shared/foo/bar

# Mapping (modeled after perforce client config)
# [ensemble]:[path] [ensemble]:[path]
# sync pods to dc
[POD1]:/root/... [DC]:/root/pods/POD1/...
[POD2]:/root/... [DC]:/root/pods/POD2/...
# sync dc to pods
[DC]:/root/shared/... [POD1]:/shared/...
[DC]:/root/shared/... [POD2]:/shared/...
[DC]:/root/shared/... [POD3]:/shared/...

Now, for our needs, we'd like the DC data aggregated, so I'll have
another process handle aggregating the pod specific data like so:

POD Data Aggregator: aggregate data in [DC]:/root/pods/POD(N) to
[DC]:/root/aggregated/data.

This is just off the top of my head.

-Todd

 
 Thanks
 mahadev
 
  -Todd



[jira] Updated: (ZOOKEEPER-499) electionAlg should default to FLE (3) - regression

2009-08-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-499:
---

Attachment: ZOOKEEPER-499_br3.2.patch
ZOOKEEPER-499.patch

patches to fix on trunk and branch (br3.2 is the branch patch)

this fixes the problem - electionAlg again defaults to 3
it also adds a test to verify FLE is used by default
it also fixes a test that fails if FLE is used (vs algo 0), which is due to a 
difference in the way the JDK exposes unresolved host names when using UDP vs TCP.


 electionAlg should default to FLE (3) - regression
 --

 Key: ZOOKEEPER-499
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-499
 Project: Zookeeper
  Issue Type: Bug
  Components: server, tests
Affects Versions: 3.2.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt
Priority: Blocker
 Fix For: 3.2.1, 3.3.0

 Attachments: ZOOKEEPER-499.patch, ZOOKEEPER-499_br3.2.patch


 there's a regression in 3.2 - electionAlg is no longer defaulting to 3 
 (incorrectly defaults to 0)
 also - need to have tests to validate this

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-499) electionAlg should default to FLE (3) - regression

2009-08-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-499:
---

Status: Patch Available  (was: Open)

 electionAlg should default to FLE (3) - regression
 --

 Key: ZOOKEEPER-499
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-499
 Project: Zookeeper
  Issue Type: Bug
  Components: server, tests
Affects Versions: 3.2.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt
Priority: Blocker
 Fix For: 3.2.1, 3.3.0

 Attachments: ZOOKEEPER-499.patch, ZOOKEEPER-499_br3.2.patch


 there's a regression in 3.2 - electionAlg is no longer defaulting to 3 
 (incorrectly defaults to 0)
 also - need to have tests to validate this

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration

2009-08-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt reassigned ZOOKEEPER-498:
--

Assignee: Flavio Paiva Junqueira  (was: Patrick Hunt)

 Unending Leader Elections : WAN configuration
 -

 Key: ZOOKEEPER-498
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Affects Versions: 3.2.0
 Environment: Each machine:
 CentOS 5.2 64-bit
 2GB ram
 java version 1.6.0_13
 Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed 
 Network Topology:
 DC : central data center
 POD(N): remote data center
 Zookeeper Topology:
 Leaders may be elected only in DC (weight = 1)
 Only followers are elected in PODS (weight = 0)
Reporter: Todd Greenwood-Geer
Assignee: Flavio Paiva Junqueira
Priority: Critical
 Fix For: 3.2.1, 3.3.0

 Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, 
 zk498-test.tar.gz, zoo.cfg


 In a WAN configuration, ZooKeeper is endlessly electing, terminating, and 
 re-electing a ZooKeeper leader. The WAN configuration involves two groups, a 
 central DC group of ZK servers that have a voting weight = 1, and a group of 
 servers in remote pods with a voting weight of 0.
 What we expect to see is leaders elected only in the DC, and the pods to 
 contain only followers. What we are seeing is a continuous cycling of 
 leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended 
 patches (473, 479, 481, 491), and now release 3.2.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration

2009-08-05 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739787#action_12739787
 ] 

Patrick Hunt commented on ZOOKEEPER-498:


Please fix the following as well - incorrect logging levels are being used in 
quorum code, example:

2009-08-05 15:17:02,733 - ERROR [WorkerSender Thread:quorumcnxmana...@341] - 
There is a connection for server 1
2009-08-05 15:17:02,753 - ERROR [WorkerSender Thread:quorumcnxmana...@341] - 
There is a connection for server 2

this is INFO, not ERROR


 Unending Leader Elections : WAN configuration
 -

 Key: ZOOKEEPER-498
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Affects Versions: 3.2.0
 Environment: Each machine:
 CentOS 5.2 64-bit
 2GB ram
 java version 1.6.0_13
 Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed 
 Network Topology:
 DC : central data center
 POD(N): remote data center
 Zookeeper Topology:
 Leaders may be elected only in DC (weight = 1)
 Only followers are elected in PODS (weight = 0)
Reporter: Todd Greenwood-Geer
Assignee: Flavio Paiva Junqueira
Priority: Critical
 Fix For: 3.2.1, 3.3.0

 Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, 
 zk498-test.tar.gz, zoo.cfg


 In a WAN configuration, ZooKeeper is endlessly electing, terminating, and 
 re-electing a ZooKeeper leader. The WAN configuration involves two groups, a 
 central DC group of ZK servers that have a voting weight = 1, and a group of 
 servers in remote pods with a voting weight of 0.
 What we expect to see is leaders elected only in the DC, and the pods to 
 contain only followers. What we are seeing is a continuous cycling of 
 leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended 
 patches (473, 479, 481, 491), and now release 3.2.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration

2009-08-05 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739789#action_12739789
 ] 

Patrick Hunt commented on ZOOKEEPER-498:


Todd, I did see an issue with your config; it's not:

group.1:1:2:3

rather it's:

group.1=1:2:3

(should be = not : )


Regardless though - even after I fix this it's still not forming a cluster 
properly, we're still looking.


 Unending Leader Elections : WAN configuration
 -

 Key: ZOOKEEPER-498
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Affects Versions: 3.2.0
 Environment: Each machine:
 CentOS 5.2 64-bit
 2GB ram
 java version 1.6.0_13
 Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed 
 Network Topology:
 DC : central data center
 POD(N): remote data center
 Zookeeper Topology:
 Leaders may be elected only in DC (weight = 1)
 Only followers are elected in PODS (weight = 0)
Reporter: Todd Greenwood-Geer
Assignee: Flavio Paiva Junqueira
Priority: Critical
 Fix For: 3.2.1, 3.3.0

 Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, 
 zk498-test.tar.gz, zoo.cfg


 In a WAN configuration, ZooKeeper is endlessly electing, terminating, and 
 re-electing a ZooKeeper leader. The WAN configuration involves two groups, a 
 central DC group of ZK servers that have a voting weight = 1, and a group of 
 servers in remote pods with a voting weight of 0.
 What we expect to see is leaders elected only in the DC, and the pods to 
 contain only followers. What we are seeing is a continuous cycling of 
 leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended 
 patches (473, 479, 481, 491), and now release 3.2.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (ZOOKEEPER-500) Async methods shouldnt throw exceptions

2009-08-05 Thread Utkarsh Srivastava (JIRA)
Async methods shouldnt throw exceptions
---

 Key: ZOOKEEPER-500
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-500
 Project: Zookeeper
  Issue Type: Improvement
  Components: contrib-bookkeeper
Reporter: Utkarsh Srivastava


Async methods like asyncLedgerCreate and Open shouldn't be throwing 
InterruptedException and BKExceptions. 

The present method signatures lead to messy application code since one is 
forced to have error handling code in 2 places: inside the callback to handle 
a non-OK return code, and outside to handle the exceptions thrown by the 
call. 

There should be only one way to indicate error conditions, and that should be 
through a non-ok return code to the callback.
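
A rough sketch of the signature change being proposed, using made-up names
rather than the real BookKeeper API: the async call itself declares no checked
exceptions, and every outcome, success or failure, arrives as a return code in
the callback.

    public class AsyncApiSketch {
        // hypothetical callback: the single place errors are reported
        interface OpenCallback {
            void openComplete(int returnCode, Object ledgerHandle, Object ctx);
        }

        static final int OK = 0;
        static final int NO_SUCH_LEDGER = -1; // made-up return codes

        // before: the call throws InterruptedException/BKException AND reports via
        // the callback; after (proposed): no checked exceptions, callback only
        static void asyncOpenLedger(long ledgerId, OpenCallback cb, Object ctx) {
            // ... start the open in the background; on any failure invoke
            // cb.openComplete(NO_SUCH_LEDGER, null, ctx) instead of throwing ...
            cb.openComplete(OK, new Object(), ctx); // stub completion for the sketch
        }

        public static void main(String[] args) {
            asyncOpenLedger(42L, new OpenCallback() {
                public void openComplete(int rc, Object handle, Object ctx) {
                    if (rc != OK) {
                        System.err.println("open failed, rc=" + rc); // the one error path
                        return;
                    }
                    System.out.println("ledger opened");
                }
            }, null);
        }
    }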

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-490) the java docs for session creation are misleading/incomplete

2009-08-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-490:
---

Attachment: ZOOKEEPER-490.patch

this patch updates the javadoc for zk construction
talks about async nature
talks about thread safety


 the java docs for session creation are misleading/incomplete
 

 Key: ZOOKEEPER-490
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-490
 Project: Zookeeper
  Issue Type: Bug
Affects Versions: 3.1.1, 3.2.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Fix For: 3.2.1, 3.3.0

 Attachments: ZOOKEEPER-490.patch


 the javadoc for ZooKeeper constructor says:
  * The client object will pick an arbitrary server and try to connect to 
 it.
  * If failed, it will try the next one in the list, until a connection is
  * established, or all the servers have been tried.
 the "or all the servers have been tried" phrase is misleading; it should indicate 
 that we retry until success, connection closed, or session expired. 
 We also need to mention that the connection is async, that the constructor returns 
 immediately, and that you need to look for the connection event in the watcher.
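
A small, self-contained example of the pattern the javadoc should spell out:
the constructor returns immediately, and the client waits for the SyncConnected
event delivered to its watcher before using the handle (the connect string and
timeout below are placeholders).

    import java.util.concurrent.CountDownLatch;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.Watcher.Event.KeeperState;
    import org.apache.zookeeper.ZooKeeper;

    public class ConnectExample {
        public static void main(String[] args) throws Exception {
            final CountDownLatch connected = new CountDownLatch(1);
            // the constructor returns immediately; the connection happens asynchronously
            ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, new Watcher() {
                public void process(WatchedEvent event) {
                    if (event.getState() == KeeperState.SyncConnected) {
                        connected.countDown(); // the session is now usable
                    }
                }
            });
            connected.await(); // block until the connection event arrives
            System.out.println("connected, session id: 0x"
                    + Long.toHexString(zk.getSessionId()));
            zk.close();
        }
    }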

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-490) the java docs for session creation are misleading/incomplete

2009-08-05 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-490:
---

Status: Patch Available  (was: Open)

 the java docs for session creation are misleading/incomplete
 

 Key: ZOOKEEPER-490
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-490
 Project: Zookeeper
  Issue Type: Bug
Affects Versions: 3.2.0, 3.1.1
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Fix For: 3.2.1, 3.3.0

 Attachments: ZOOKEEPER-490.patch


 the javadoc for ZooKeeper constructor says:
  * The client object will pick an arbitrary server and try to connect to 
 it.
  * If failed, it will try the next one in the list, until a connection is
  * established, or all the servers have been tried.
 the "or all the servers have been tried" phrase is misleading; it should indicate 
 that we retry until success, connection closed, or session expired. 
 We also need to mention that the connection is async, that the constructor returns 
 immediately, and that you need to look for the connection event in the watcher.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration

2009-08-05 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739891#action_12739891
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-498:
--

Pat, we have a description of how to configure this in the Cluster Options 
section of the Administrator guide. We are missing an example, which is in the 
source code as you point out.

 Unending Leader Elections : WAN configuration
 -

 Key: ZOOKEEPER-498
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Affects Versions: 3.2.0
 Environment: Each machine:
 CentOS 5.2 64-bit
 2GB ram
 java version 1.6.0_13
 Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed 
 Network Topology:
 DC : central data center
 POD(N): remote data center
 Zookeeper Topology:
 Leaders may be elected only in DC (weight = 1)
 Only followers are elected in PODS (weight = 0)
Reporter: Todd Greenwood-Geer
Assignee: Flavio Paiva Junqueira
Priority: Critical
 Fix For: 3.2.1, 3.3.0

 Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, 
 zk498-test.tar.gz, zoo.cfg


 In a WAN configuration, ZooKeeper is endlessly electing, terminating, and 
 re-electing a ZooKeeper leader. The WAN configuration involves two groups, a 
 central DC group of ZK servers that have a voting weight = 1, and a group of 
 servers in remote pods with a voting weight of 0.
 What we expect to see is leaders elected only in the DC, and the pods to 
 contain only followers. What we are seeing is a continuous cycling of 
 leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended 
 patches (473, 479, 481, 491), and now release 3.2.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration

2009-08-05 Thread Flavio Paiva Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Paiva Junqueira updated ZOOKEEPER-498:
-

Attachment: ZOOKEEPER-498.patch

I have generated a patch for this issue. I verified that I didn't do the 
correct checks in ZOOKEEPER-491, so I have tried to fix that in this patch. I have 
also modified the test to fix the problem with the failing assertion, and I have 
inspected the logs to see if it is behaving as expected. I can see no problem 
at this time with this patch.

If someone else is interested in checking it out, please do.

 Unending Leader Elections : WAN configuration
 -

 Key: ZOOKEEPER-498
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Affects Versions: 3.2.0
 Environment: Each machine:
 CentOS 5.2 64-bit
 2GB ram
 java version 1.6.0_13
 Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed 
 Network Topology:
 DC : central data center
 POD(N): remote data center
 Zookeeper Topology:
 Leaders may be elected only in DC (weight = 1)
 Only followers are elected in PODS (weight = 0)
Reporter: Todd Greenwood-Geer
Assignee: Flavio Paiva Junqueira
Priority: Critical
 Fix For: 3.2.1, 3.3.0

 Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, 
 zk498-test.tar.gz, zoo.cfg, ZOOKEEPER-498.patch


 In a WAN configuration, ZooKeeper is endlessly electing, terminating, and 
 re-electing a ZooKeeper leader. The WAN configuration involves two groups, a 
 central DC group of ZK servers that have a voting weight = 1, and a group of 
 servers in remote pods with a voting weight of 0.
 What we expect to see is leaders elected only in the DC, and the pods to 
 contain only followers. What we are seeing is a continuous cycling of 
 leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended 
 patches (473, 479, 481, 491), and now release 3.2.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration

2009-08-05 Thread Flavio Paiva Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Paiva Junqueira updated ZOOKEEPER-498:
-

Status: Patch Available  (was: Open)

 Unending Leader Elections : WAN configuration
 -

 Key: ZOOKEEPER-498
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Affects Versions: 3.2.0
 Environment: Each machine:
 CentOS 5.2 64-bit
 2GB ram
 java version 1.6.0_13
 Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed 
 Network Topology:
 DC : central data center
 POD(N): remote data center
 Zookeeper Topology:
 Leaders may be elected only in DC (weight = 1)
 Only followers are elected in PODS (weight = 0)
Reporter: Todd Greenwood-Geer
Assignee: Flavio Paiva Junqueira
Priority: Critical
 Fix For: 3.2.1, 3.3.0

 Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, 
 zk498-test.tar.gz, zoo.cfg, ZOOKEEPER-498.patch


 In a WAN configuration, ZooKeeper is endlessly electing, terminating, and 
 re-electing a ZooKeeper leader. The WAN configuration involves two groups, a 
 central DC group of ZK servers that have a voting weight = 1, and a group of 
 servers in remote pods with a voting weight of 0.
 What we expect to see is leaders elected only in the DC, and the pods to 
 contain only followers. What we are seeing is a continuous cycling of 
 leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended 
 patches (473, 479, 481, 491), and now release 3.2.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



build failures on hudson zones

2009-08-05 Thread Giridharan Kesavan
Builds on hudson.zones are failing as the zone storage for hudson is full.
I've sent an email to the ASF infra team about the space issues on hudson 
zones.

Once the issue is resolved I will restart hudson builds.

Thanks,
Giri




BUILDS ARE BACK NORMAL

2009-08-05 Thread Giridharan Kesavan
Restarted all the build jobs on hudson; builds are running fine.
The build failures were due to /tmp: File system full, swap space limit exceeded.

Thanks,
-Giri

 -Original Message-
 From: Giridharan Kesavan [mailto:gkesa...@yahoo-inc.com]
 Sent: Thursday, August 06, 2009 9:16 AM
 To: mapreduce-...@hadoop.apache.org; hdfs-...@hadoop.apache.org;
 common-...@hadoop.apache.org; pig-...@hadoop.apache.org; zookeeper-
 d...@hadoop.apache.org
 Subject: build failures on hudson zones
 
 Build on hudson.zones are failing as the zonestorage for hudson is
 full.
 I 've sent an email to the ASF infra team about the space issues on
 hudson zones.
 
 Once the issues is resolved I would restart hudson for builds.
 
 Thanks,
 Giri
 



[jira] Updated: (ZOOKEEPER-483) ZK fataled on me, and ugly

2009-08-05 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-483:


Attachment: ZOOKEEPER-483.patch

 ZK fataled on me, and ugly
 --

 Key: ZOOKEEPER-483
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-483
 Project: Zookeeper
  Issue Type: Bug
Affects Versions: 3.1.1
Reporter: ryan rawson
Assignee: Benjamin Reed
 Fix For: 3.2.1, 3.3.0

 Attachments: zklogs.tar.gz, ZOOKEEPER-483.patch, ZOOKEEPER-483.patch


 here is the part of the log where my zookeeper instance crashed, taking 3 
 out of 5 servers down and thus ruining the quorum for all clients:
 2009-07-23 12:29:06,769 WARN org.apache.zookeeper.server.NIOServerCnxn: 
 Exception causing close of session 0x52276d1d5161350 due to 
 java.io.IOException: Read error
 2009-07-23 12:29:00,756 WARN org.apache.zookeeper.server.quorum.Follower: 
 Exception when following the leader
 java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at 
 org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
 at 
 org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:65)
 at 
 org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
 at 
 org.apache.zookeeper.server.quorum.Follower.readPacket(Follower.java:114)
 at 
 org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:243)
 at 
 org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:494)
 2009-07-23 12:29:06,770 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x52276d1d5161350 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.168:39489]
 2009-07-23 12:29:06,770 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x12276d15dfb0578 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.159:46797]
 2009-07-23 12:29:06,771 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x42276d1d3fa013e NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.153:33998]
 2009-07-23 12:29:06,771 WARN org.apache.zookeeper.server.NIOServerCnxn: 
 Exception causing close of session 0x52276d1d5160593 due to 
 java.io.IOException: Read error
 2009-07-23 12:29:06,808 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x32276d15d2e02bb NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.158:53758]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x42276d1d3fa13e4 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.154:58681]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x22276d15e691382 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.162:59967]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x12276d15dfb1354 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.163:49957]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x42276d1d3fa13cd NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.150:34212]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x22276d15e691383 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.159:46813]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x12276d15dfb0350 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.162:59956]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x32276d15d2e139b NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.156:55138]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x32276d15d2e1398 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.167:41257]
 2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x52276d1d5161355 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.153:34032]
 2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x52276d1d516011c NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 

[jira] Commented: (ZOOKEEPER-483) ZK fataled on me, and ugly

2009-08-05 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739898#action_12739898
 ] 

Benjamin Reed commented on ZOOKEEPER-483:
-

I've addressed 1) in the attached patch.

For 2), we are not eating the IOException; we are actually shutting things down. 
The bug is that we are passing it up to the upper layer, which does 
not know anything about the follower thread. We need to handle it here.

 ZK fataled on me, and ugly
 --

 Key: ZOOKEEPER-483
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-483
 Project: Zookeeper
  Issue Type: Bug
Affects Versions: 3.1.1
Reporter: ryan rawson
Assignee: Benjamin Reed
 Fix For: 3.2.1, 3.3.0

 Attachments: zklogs.tar.gz, ZOOKEEPER-483.patch, ZOOKEEPER-483.patch


 here is the part of the log where my zookeeper instance crashed, taking 3 
 out of 5 servers down and thus ruining the quorum for all clients:
 2009-07-23 12:29:06,769 WARN org.apache.zookeeper.server.NIOServerCnxn: 
 Exception causing close of session 0x52276d1d5161350 due to 
 java.io.IOException: Read error
 2009-07-23 12:29:00,756 WARN org.apache.zookeeper.server.quorum.Follower: 
 Exception when following the leader
 java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at 
 org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
 at 
 org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:65)
 at 
 org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
 at 
 org.apache.zookeeper.server.quorum.Follower.readPacket(Follower.java:114)
 at 
 org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:243)
 at 
 org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:494)
 2009-07-23 12:29:06,770 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x52276d1d5161350 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.168:39489]
 2009-07-23 12:29:06,770 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x12276d15dfb0578 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.159:46797]
 2009-07-23 12:29:06,771 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x42276d1d3fa013e NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.153:33998]
 2009-07-23 12:29:06,771 WARN org.apache.zookeeper.server.NIOServerCnxn: 
 Exception causing close of session 0x52276d1d5160593 due to 
 java.io.IOException: Read error
 2009-07-23 12:29:06,808 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x32276d15d2e02bb NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.158:53758]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x42276d1d3fa13e4 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.154:58681]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x22276d15e691382 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.162:59967]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x12276d15dfb1354 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.163:49957]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x42276d1d3fa13cd NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.150:34212]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x22276d15e691383 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.159:46813]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x12276d15dfb0350 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.162:59956]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x32276d15d2e139b NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.156:55138]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x32276d15d2e1398 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.167:41257]
 2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x52276d1d5161355 NIOServerCnxn: 
 

[jira] Updated: (ZOOKEEPER-483) ZK fataled on me, and ugly

2009-08-05 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-483:


Status: Patch Available  (was: Open)

 ZK fataled on me, and ugly
 --

 Key: ZOOKEEPER-483
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-483
 Project: Zookeeper
  Issue Type: Bug
Affects Versions: 3.1.1
Reporter: ryan rawson
Assignee: Benjamin Reed
 Fix For: 3.2.1, 3.3.0

 Attachments: zklogs.tar.gz, ZOOKEEPER-483.patch, ZOOKEEPER-483.patch


 here is the part of the log where my zookeeper instance crashed, taking 3 
 out of 5 servers down and thus ruining the quorum for all clients:
 2009-07-23 12:29:06,769 WARN org.apache.zookeeper.server.NIOServerCnxn: 
 Exception causing close of session 0x52276d1d5161350 due to 
 java.io.IOException: Read error
 2009-07-23 12:29:00,756 WARN org.apache.zookeeper.server.quorum.Follower: 
 Exception when following the leader
 java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at 
 org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
 at 
 org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:65)
 at 
 org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
 at 
 org.apache.zookeeper.server.quorum.Follower.readPacket(Follower.java:114)
 at 
 org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:243)
 at 
 org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:494)
 2009-07-23 12:29:06,770 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x52276d1d5161350 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.168:39489]
 2009-07-23 12:29:06,770 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x12276d15dfb0578 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.159:46797]
 2009-07-23 12:29:06,771 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x42276d1d3fa013e NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.153:33998]
 2009-07-23 12:29:06,771 WARN org.apache.zookeeper.server.NIOServerCnxn: 
 Exception causing close of session 0x52276d1d5160593 due to 
 java.io.IOException: Read error
 2009-07-23 12:29:06,808 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x32276d15d2e02bb NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.158:53758]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x42276d1d3fa13e4 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.154:58681]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x22276d15e691382 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.162:59967]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x12276d15dfb1354 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.163:49957]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x42276d1d3fa13cd NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.150:34212]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x22276d15e691383 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.159:46813]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x12276d15dfb0350 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.162:59956]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x32276d15d2e139b NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.156:55138]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x32276d15d2e1398 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.167:41257]
 2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x52276d1d5161355 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.153:34032]
 2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x52276d1d516011c NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 

Re: BUILDS ARE BACK NORMAL

2009-08-05 Thread Mahadev Konar
Hi all, 
 As Giri mentioned, the builds are back to normal and so is the patch
process:
http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/Zookeeper-Patch-vesta.apache.org/

The patches are being run against hudson, so you DO NOT need to cancel and
resubmit patches.

Thanks
mahadev


On 8/5/09 9:50 PM, Giridharan  Kesavan gkesa...@yahoo-inc.com wrote:

 Restarted all the build jobs on hudson; Builds are running fine.
 Build failures are due to   /tmp: File system full, swap space limit exceeded
 
 
 Thanks,
 -Giri
 
 -Original Message-
 From: Giridharan Kesavan [mailto:gkesa...@yahoo-inc.com]
 Sent: Thursday, August 06, 2009 9:16 AM
 To: mapreduce-...@hadoop.apache.org; hdfs-...@hadoop.apache.org;
 common-...@hadoop.apache.org; pig-...@hadoop.apache.org; zookeeper-
 d...@hadoop.apache.org
 Subject: build failures on hudson zones
 
 Build on hudson.zones are failing as the zonestorage for hudson is
 full.
 I 've sent an email to the ASF infra team about the space issues on
 hudson zones.
 
 Once the issues is resolved I would restart hudson for builds.
 
 Thanks,
 Giri