[jira] Commented: (ZOOKEEPER-759) Stop accepting connections when close to file descriptor limit

2010-04-29 Thread Ted Dunning (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862303#action_12862303
 ] 

Ted Dunning commented on ZOOKEEPER-759:
---


This is a Unix-specific bean, so don't forget to defang the test if the bean 
isn't available.
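
A minimal sketch of that guard, assuming a HotSpot/OpenJDK-style JVM where the descriptor counts come from com.sun.management.UnixOperatingSystemMXBean. The class name FdProbe is hypothetical and not part of ZooKeeper. A test can skip itself when either method returns an empty value.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.OptionalLong;

// Hypothetical helper (not ZooKeeper code): reads file descriptor counts
// only when the Unix-specific MXBean is actually present on this JVM.
final class FdProbe {
    private FdProbe() {}

    // Returns the Unix bean, or null on platforms that do not provide it.
    private static UnixOperatingSystemMXBean unixBean() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return (os instanceof UnixOperatingSystemMXBean)
                ? (UnixOperatingSystemMXBean) os
                : null;
    }

    /** Open descriptor count, or empty when the bean is unavailable. */
    static OptionalLong openFileDescriptors() {
        UnixOperatingSystemMXBean bean = unixBean();
        return bean == null
                ? OptionalLong.empty()
                : OptionalLong.of(bean.getOpenFileDescriptorCount());
    }

    /** Descriptor limit, or empty when the bean is unavailable. */
    static OptionalLong maxFileDescriptors() {
        UnixOperatingSystemMXBean bean = unixBean();
        return bean == null
                ? OptionalLong.empty()
                : OptionalLong.of(bean.getMaxFileDescriptorCount());
    }
}
```

A test would check openFileDescriptors().isPresent() first and return early (or mark itself skipped) when the bean is missing.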



 Stop accepting connections when close to file descriptor limit
 --

 Key: ZOOKEEPER-759
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-759
 Project: Zookeeper
  Issue Type: Improvement
  Components: server
Reporter: Travis Crawford

 Zookeeper always tries to accept new connections, throwing an exception if 
 out of file descriptors. An improvement would be denying new client 
 connections when close to the limit.
 Additionally, file-descriptor limits+usage should be exported to the 
 monitoring four-letter word, should that get implemented (see ZOOKEEPER-744).
 DETAILS
 A Zookeeper ensemble I administer recently suffered an outage when one node 
 was restarted with the low system-default ulimit of 1024 file descriptors and 
 later ran out. File descriptor usage+max are already being monitored by the 
 following MBeans:
 - java.lang.OperatingSystem.MaxFileDescriptorCount
 - java.lang.OperatingSystem.OpenFileDescriptorCount
 They're described (rather tersely) at:
 http://java.sun.com/javase/6/docs/jre/api/management/extension/com/sun/management/UnixOperatingSystemMXBean.html
 This feature request is for the following:
 (a) Stop accepting new connections when OpenFileDescriptorCount is close to 
 MaxFileDescriptorCount, defaulting to 95% FD usage. Denied connections should 
 be logged to disk at debug level and should increment a 
 ``ConnectionDeniedCount`` MBean counter.
 (b) Begin accepting new connections when usage drops below some configurable 
 threshold, defaulting to 90% of FD usage, basically the high/low watermark 
 model.
 (c) Update the administrators guide with a comment about using an appropriate 
 FD limit.
 (d) Extra credit: if ZOOKEEPER-744 is implemented, export statistics for:
 zookeeper_open_file_descriptor_count
 zookeeper_max_file_descriptor_count
 zookeeper_max_file_descriptor_mismatch - boolean, exported by leader, if not 
 all zk's have the same max FD value
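
The high/low watermark model in items (a) and (b) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class FdWatermarkGate and its methods are hypothetical, the caller supplies the two descriptor counts, and the deniedCount field merely stands in for the proposed ConnectionDeniedCount counter.

```java
// Hypothetical sketch (not ZooKeeper code) of a high/low watermark gate.
final class FdWatermarkGate {
    private final double highFraction; // stop accepting at or above this usage
    private final double lowFraction;  // resume accepting below this usage
    private boolean accepting = true;
    private long deniedCount = 0;      // stand-in for ConnectionDeniedCount

    FdWatermarkGate(double highFraction, double lowFraction) {
        if (!(0.0 < lowFraction && lowFraction < highFraction && highFraction <= 1.0)) {
            throw new IllegalArgumentException("need 0 < low < high <= 1");
        }
        this.highFraction = highFraction;
        this.lowFraction = lowFraction;
    }

    /** Decides whether one new connection may be accepted right now. */
    synchronized boolean tryAccept(long openFds, long maxFds) {
        if (maxFds <= 0) {
            throw new IllegalArgumentException("maxFds must be positive");
        }
        double usage = (double) openFds / maxFds;
        if (accepting && usage >= highFraction) {
            accepting = false;          // crossed the high watermark
        } else if (!accepting && usage < lowFraction) {
            accepting = true;           // dropped below the low watermark
        }
        if (!accepting) {
            deniedCount++;
        }
        return accepting;
    }

    synchronized long deniedCount() {
        return deniedCount;
    }
}
```

With new FdWatermarkGate(0.95, 0.90), a server at 96% usage starts denying and keeps denying at 92%, then accepts again once usage falls under 90%. The gap between the two thresholds is what prevents rapid flapping around a single limit.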

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-22) Automatic request retries on connect failover

2009-10-30 Thread Ted Dunning (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772020#action_12772020
 ] 

Ted Dunning commented on ZOOKEEPER-22:
--


Is there progress on this issue?

 Automatic request retries on connect failover
 -

 Key: ZOOKEEPER-22
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-22
 Project: Zookeeper
  Issue Type: New Feature
  Components: c client, java client
Reporter: Patrick Hunt
Assignee: Mahadev konar
 Fix For: 3.3.0


 Moved from SourceForge to Apache.
 http://sourceforge.net/tracker/index.php?func=detail&aid=1831412&group_id=209147&atid=1008547
 When a connection to a ZooKeeper server fails, all of the pending requests
 will return an error. In reality the requests should be resubmitted when
 the client reestablishes a connection to ZooKeeper.
 For read requests, it's no big deal to just reissue the request. For update
 requests, ZooKeeper must be able to detect if the request has been
 processed and, if so, return the result of the previous execution;
 otherwise, it should process the request.
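
One way to picture the update-request rule above is a replay cache keyed by session and request id. The issue text does not say how ZooKeeper would implement this, so everything here is an assumption for illustration: the class ReplayCache, the (sessionId, requestId) key, and the unbounded in-memory map are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch (not ZooKeeper code): run an update at most once per
// (sessionId, requestId) and replay the stored result on a resubmission.
final class ReplayCache<R> {
    private final Map<String, R> results = new HashMap<>();

    synchronized R executeOnce(long sessionId, long requestId, Supplier<R> update) {
        String key = sessionId + ":" + requestId;
        if (results.containsKey(key)) {
            // Already processed: return the result of the previous execution.
            return results.get(key);
        }
        R result = update.get();   // first time: actually process the request
        results.put(key, result);
        return result;
    }
}
```

A real server would also need to bound and expire this table, and to make it survive failover between servers, which is where most of the difficulty lies.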




[jira] Commented: (ZOOKEEPER-22) Automatic request retries on connect failover

2009-10-30 Thread Ted Dunning (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772165#action_12772165
 ] 

Ted Dunning commented on ZOOKEEPER-22:
--


I wouldn't call it laziness.  At most distraction.

But a lot of ZK users will breathe a sigh of relief when this fix gets deployed!

Thanks for your efforts on this.




[jira] Created: (ZOOKEEPER-556) Startup messages should account for common error of missing leading slash in config files

2009-10-22 Thread Ted Dunning (JIRA)
Startup messages should account for common error of missing leading slash in 
config files
-

 Key: ZOOKEEPER-556
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-556
 Project: Zookeeper
  Issue Type: Bug
Reporter: Ted Dunning


It would be nice if the startup noticed directories without a leading slash in 
the config file.  That is worth a warning.

Moreover, if that directory exists when resolved from the root but not when 
resolved from the current directory, a very serious warning is in order.
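
A rough sketch of the check being requested, assuming Unix-style paths. The class DataDirCheck, its Level enum, and the injected existence test are all hypothetical names used for illustration, not anything in ZooKeeper's startup code.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Predicate;

// Hypothetical sketch (not ZooKeeper code): classify a configured directory.
final class DataDirCheck {
    enum Level { OK, WARN, SEVERE }

    private DataDirCheck() {}

    /**
     * OK for absolute paths, WARN for any relative path, and SEVERE when the
     * relative path exists under "/" but not under the working directory.
     */
    static Level check(String configured, Path cwd, Predicate<Path> exists) {
        Path p = Paths.get(configured);
        if (p.isAbsolute()) {
            return Level.OK;
        }
        Path fromRoot = Paths.get("/").resolve(p); // what the user probably meant
        Path fromCwd = cwd.resolve(p);             // what the server will use
        if (exists.test(fromRoot) && !exists.test(fromCwd)) {
            return Level.SEVERE;
        }
        return Level.WARN;
    }

    /** Convenience overload using the real file system and working directory. */
    static Level check(String configured) {
        return check(configured, Paths.get("").toAbsolutePath(), Files::exists);
    }
}
```

For example, check("home/mark/zookeeper") would yield SEVERE on a machine where /home/mark/zookeeper exists and the server was started from some other directory.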






[jira] Commented: (ZOOKEEPER-556) Startup messages should account for common error of missing leading slash in config files

2009-10-22 Thread Ted Dunning (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768937#action_12768937
 ] 

Ted Dunning commented on ZOOKEEPER-556:
---

In the following exchange, a new Zookeeper user lost several days wrestling 
with this issue.  Henry spotted the problem.  I didn't.  Patrick didn't.

With a prominent error message, the user would have found this in 5 minutes.

{noformat}
yeah - thought this was it: you've missed the forward slash on
home/mark/zookeeper (this turned up on your exception message).

On Thu, Oct 22, 2009 at 2:55 PM, Mark Vigeant
mark.vige...@riskmetrics.com wrote:

 Yeah I just figured out the problem with zoocfg.py

 I am running as the same user who created myid. Here's my config:

 zoo.cfg

 tickTime-2000
 dataDir=home/mark/zookeeper
 clientPort=2181
 initLimit=5
 syncLimit=2
 server.1= hermes:2888:3888
 server.2= leela:2888:3888

 on the machines hermes and leela I've put myid files in
 /home/mark/zookeeper
 with the numbers 1 and 2 respectively
{noformat}





Re: feedback zkclient

2009-10-01 Thread Ted Dunning
I think another way to say this is that ZkClient is leaning toward the
Spring philosophy: if the caller can't (or won't) handle the situation,
they shouldn't be forced to declare it.  The Spring jdbcTemplate is a
grand example of the benefits of this.

First implementations of this policy generally are a bit too broad, though,
so this should be examined carefully.

On Thu, Oct 1, 2009 at 8:05 AM, Peter Voss i...@petervoss.org wrote:

 5) there's a lot of wrapping of exceptions, looks like this is done in
 order to make them unchecked. Is this wise? How much simpler does it
 really make things? Esp things like interrupted exception? As you mentioned,
 one of your intents is to simplify things, but perhaps too simple? Some
 short, clear examples of usage would be helpful here to compare/contrast, I
 took a very quick look at some of the tests but that didn't help much. Is
 there a test(s) in particular that I should look at to see how zkclient is
 used, and the benefits incurred?


 Checked exceptions are very painful when you are assembling together a
 larger number of libraries (which is true for most enterprise applications).
 Either you wind up having a general throws Exception (which I don't really
 like, because it's too general) at most of your interfaces, or you have to
 wrap checked exceptions into runtime exceptions.

 We didn't want a library to introduce yet another checked exception that
 you MUST catch or rethrow. I know that there are different opinions about
 that, but that's the idea behind this.

 Similar situation for the InterruptedException. ZkClient also converts this
 to a runtime exception and makes sure that the interrupted flag doesn't get
 cleared. There are just too many existing libraries that have a catch
 (Exception e) somewhere that totally ignores that this would reset the
 interrupt flag, if e is an InterruptedException. Therefore we better avoid
 having all of the methods throwing that exception.
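
The conversion described above might look roughly like this. It is a sketch, not ZkClient's actual code: the names Unchecked and InterruptedRuntimeException are hypothetical. The key step is re-asserting the interrupt flag before rethrowing, so that a broad catch (Exception e) further up the stack cannot silently lose the interruption.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch (not ZkClient code): turn checked exceptions into
// unchecked ones while keeping the thread's interrupted status set.
final class Unchecked {
    /** Unchecked stand-in for InterruptedException. */
    static final class InterruptedRuntimeException extends RuntimeException {
        InterruptedRuntimeException(InterruptedException cause) {
            super(cause);
        }
    }

    private Unchecked() {}

    static <T> T call(Callable<T> body) {
        try {
            return body.call();
        } catch (InterruptedException e) {
            // Restore the flag so callers can still observe the interrupt.
            Thread.currentThread().interrupt();
            throw new InterruptedRuntimeException(e);
        } catch (RuntimeException e) {
            throw e;                       // already unchecked, pass through
        } catch (Exception e) {
            throw new RuntimeException(e); // wrap any other checked exception
        }
    }
}
```

A caller that cares can still catch InterruptedRuntimeException or test Thread.currentThread().isInterrupted(); a caller that does not care is no longer forced to declare anything.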




-- 
Ted Dunning, CTO
DeepDyve


Re: feedback zkclient

2009-10-01 Thread Ted Dunning
There is no good way to avoid this entirely without a massive performance
loss, because the connection could be lost while the confirmation is on its
way back.

You may be able to tell if the file is yours by examining the content and
ownership, but this is pretty implementation dependent.  In particular, it
makes queues very difficult to implement correctly.  If this happens during
the creation of an ephemeral file, the only option may be to close the
connection (thus deleting all ephemeral files) and start over.

On Thu, Oct 1, 2009 at 8:05 AM, Peter Voss i...@petervoss.org wrote:

 3) there's definitely an issue in the retryUntilConnected logic that you
 need to address

 let's say you call zkclient.create, and the connection to the server is
 lost while the request is in flight. At this point ConnectionLoss is thrown
 on the client side, however you (client) have no information on whether the
 server has made the change or not. The retry method's while loop will re-run
 the create (after reconnect), and the result seen by the caller (user code)
 could be either OK or may be NODEEXISTS exception, there's no way to know
 which.

 Mahadev is working on ZOOKEEPER-22 which will address this issue, but
 that's a future version, not today.


 Good catch. I wasn't aware that nodes could still have been created when
 receiving a ConnectionLoss. But how would you deal with that?
 If we create a znode and get a ConnectionLoss exception, then wait until
 the connection is back and check if the znode is there. There is no way of
 knowing whether it was us who created the node or somebody else, right?




-- 
Ted Dunning, CTO
DeepDyve


Re: feedback zkclient

2009-10-01 Thread Ted Dunning
That looks really lovely.

Judging by history and the fact that only 40/127 issues are resolved, 3.3
is probably 3-6 months away.  Is that a fair assessment?

On Thu, Oct 1, 2009 at 11:13 AM, Patrick Hunt ph...@apache.org wrote:

 One nice thing about ephemeral is that the Stat contains the owner
 sessionid. As you say, it's highly implementation dependent. It's also
 something we recognize is a problem for users, we've slated it for 3.3.0
 http://issues.apache.org/jira/browse/ZOOKEEPER-22
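
For the ephemeral case, the owner check could be sketched as below. The Client interface is a hypothetical stand-in so the example is self-contained; with the real Java client the owner would come from Stat.getEphemeralOwner() and the local id from ZooKeeper.getSessionId(). It assumes the client reconnects within the same session between attempts, and it does not help for persistent or sequential nodes.

```java
import java.util.OptionalLong;

// Hypothetical sketch (not ZkClient or ZooKeeper code): retry an ephemeral
// create and use the owner session id to decide whether the node is ours.
final class EphemeralCreate {
    static final class ConnectionLoss extends Exception {}
    static final class NodeExists extends Exception {}

    /** Minimal stand-in for the parts of a client this sketch needs. */
    interface Client {
        long sessionId();
        void createEphemeral(String path) throws ConnectionLoss, NodeExists;
        /** Owner session of the node, or empty if the node does not exist. */
        OptionalLong ephemeralOwner(String path);
    }

    private EphemeralCreate() {}

    /** Returns true if the node exists and belongs to our session. */
    static boolean createOwned(Client zk, String path, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                zk.createEphemeral(path);
                return true;
            } catch (NodeExists e) {
                // Either an earlier attempt of ours got through before the
                // connection dropped, or another session created the node.
                OptionalLong owner = zk.ephemeralOwner(path);
                if (owner.isPresent()) {
                    return owner.getAsLong() == zk.sessionId();
                }
                // Node vanished in between: fall through and try again.
            } catch (ConnectionLoss e) {
                // Outcome unknown: the create may or may not have happened.
            }
        }
        throw new IllegalStateException("gave up creating " + path);
    }
}
```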




-- 
Ted Dunning, CTO
DeepDyve


Re: Show your ZooKeeper pride!

2009-06-08 Thread Ted Dunning
 How come Yahoo isn't listed?

On Mon, Jun 8, 2009 at 6:31 PM, Patrick Hunt ph...@apache.org wrote:

 The Hadoop summit is Wednesday. If you're attending please feel free to say
 hi -- Mahadev is presenting @4, Ben and I will be attending as well.

 Also, regardless of whether you're attending or not we'd appreciate any
 updates to the powered by page, if you're too busy to update it yourself
 send us a snippet and we'll update it for you ;-)

 http://wiki.apache.org/hadoop/ZooKeeper/PoweredBy

 Regards,

 Patrick




-- 
Ted Dunning, CTO
DeepDyve

111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
http://www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)


[jira] Created: (ZOOKEEPER-418) Need nifty zookeeper browser

2009-05-27 Thread Ted Dunning (JIRA)
Need nifty zookeeper browser


 Key: ZOOKEEPER-418
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-418
 Project: Zookeeper
  Issue Type: Bug
Reporter: Ted Dunning


It would be very nice to have a browser that would allow the state of a Zoo to 
be examined.  Even nicer would be such a utility that showed changes in real 
time.






[jira] Updated: (ZOOKEEPER-418) Need nifty zookeeper browser

2009-05-27 Thread Ted Dunning (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Dunning updated ZOOKEEPER-418:
--

Attachment: zk-view-0.1.tgz

Here is a first stab at recreating our internal tool with nice upgrades like 
real-time updates for file and directory contents.  I have never built any 
Swing UIs before, so there are bound to be infelicities galore.  Please help.

There are some warts:

1) You can't open a file that has children.

2) Opening non-text files is bad juju.

3) There seems to be a problem with the way the watchers are glued in place.  
If you create a file, it appears, but if you create children for it, it doesn't 
turn into a folder.  The workaround is to simply restart the browser.






[jira] Updated: (ZOOKEEPER-418) Need nifty zookeeper browser

2009-05-27 Thread Ted Dunning (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Dunning updated ZOOKEEPER-418:
--

Attachment: screenshot-1.jpg

Here is a simple example on a live ZK.
