[jira] Commented: (ZOOKEEPER-836) hostlist as string

2010-11-23 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934946#action_12934946
 ] 

Benjamin Reed commented on ZOOKEEPER-836:
-

it seems like overkill to have a class just to parse a hostlist. wouldn't you 
want to put that parsing in the class that actually manages the list?

we should not be passing around a list of resolved addresses, since those 
addresses and the list itself can change. (this is what i mentioned 
earlier.) instead, HostSet should take care of resolving and managing the list 
of resolved addresses. i guess we can do that as a separate patch.
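
A minimal sketch of the point above, assuming a hypothetical class that keeps only unresolved host/port pairs and looks each one up when a caller asks for the next server. The name ResolvingHostSet, its constructor, and the next() method are invented for illustration; they are not taken from the attached patch.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: not the class from the attached patch.
final class ResolvingHostSet {
    // Entries are kept unresolved so that no lookup result is stored here.
    private final List<InetSocketAddress> entries;
    private int cursor = 0;

    ResolvingHostSet(List<InetSocketAddress> unresolvedEntries) {
        if (unresolvedEntries.isEmpty()) {
            throw new IllegalArgumentException("host set must not be empty");
        }
        this.entries = new ArrayList<>(unresolvedEntries);
    }

    // Returns the next server in round-robin order, looked up at call time.
    synchronized InetSocketAddress next() {
        InetSocketAddress entry = entries.get(cursor);
        cursor = (cursor + 1) % entries.size();
        // This constructor attempts a lookup now (subject to the JVM's own
        // DNS cache); the result is marked unresolved if the lookup fails.
        return new InetSocketAddress(entry.getHostString(), entry.getPort());
    }
}
```

Callers would hold the set rather than a list of addresses, for example building it from InetSocketAddress.createUnresolved("zk1", 2181) entries and calling next() before each connection attempt.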





> hostlist as string
> --
>
> Key: ZOOKEEPER-836
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-836
> Project: Zookeeper
> Issue Type: Sub-task
> Components: java client
> Affects Versions: 3.3.1
> Reporter: Patrick Datko
> Assignee: Thomas Koch
> Attachments: ZOOKEEPER-836.patch
>
>
> The hostlist is parsed in the ctor of ClientCnxn. This violates the rule of 
> not doing (too much) work in a ctor. Instead the ClientCnxn should receive an 
> object of class "HostSet". HostSet could then be instantiated e.g. with a 
> comma separated string.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-849) Provide Path class

2010-11-23 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934929#action_12934929
 ] 

Benjamin Reed commented on ZOOKEEPER-849:
-

how do i see the patches? 

> Provide Path class
> --
>
> Key: ZOOKEEPER-849
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-849
> Project: Zookeeper
> Issue Type: Sub-task
> Components: java client
> Reporter: Thomas Koch
> Assignee: Thomas Koch
> Fix For: 3.4.0
>
>





[jira] Commented: (ZOOKEEPER-836) hostlist as string

2010-11-22 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934718#action_12934718
 ] 

Benjamin Reed commented on ZOOKEEPER-836:
-

why don't we at least call it HostSet so that we don't have to change the name 
later?




[jira] Commented: (ZOOKEEPER-836) hostlist as string

2010-11-22 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934559#action_12934559
 ] 

Benjamin Reed commented on ZOOKEEPER-836:
-

this looks good thomas. i was hoping to take it one step further. instead of 
having a ConnectStringParser and then passing around a list of 
InetSocketAddresses, it would be nice to have a HostSet and then pass that 
object around. we would also move the shuffling and the calculation of the 
connect timeout into that class.

the reason i mention this is we have another issue (which i cannot seem to 
find) about periodically re-resolving hostnames. right now if you change address 
resolutions at the DNS server, the client will not pick it up. if we were 
passing around a HostSet object, follow-on work could have a periodic 
re-resolution encapsulated in that class.

what do you think? 
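
A rough sketch of what such a HostSet could look like, with the parsing, the shuffling, and the connect-timeout calculation in one place. Only the class name comes from this thread. The method names, the fallback to port 2181 when an entry has no port, and the choice to divide the session timeout evenly among the hosts are assumptions of this sketch. Chroot suffixes and IPv6 literals are not handled.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the HostSet discussed above; not the patch's API.
final class HostSet {
    private static final int DEFAULT_PORT = 2181; // ZooKeeper's default client port

    private final List<InetSocketAddress> hosts;

    private HostSet(List<InetSocketAddress> hosts) {
        this.hosts = hosts;
    }

    // Parses "host1:port1,host2:port2,..." into unresolved addresses.
    static HostSet fromConnectString(String connectString) {
        List<InetSocketAddress> parsed = new ArrayList<>();
        for (String entry : connectString.split(",")) {
            String host = entry.trim();
            if (host.isEmpty()) {
                continue; // tolerate stray commas
            }
            int port = DEFAULT_PORT;
            int colon = host.lastIndexOf(':');
            if (colon >= 0) {
                port = Integer.parseInt(host.substring(colon + 1));
                host = host.substring(0, colon);
            }
            parsed.add(InetSocketAddress.createUnresolved(host, port));
        }
        if (parsed.isEmpty()) {
            throw new IllegalArgumentException("no hosts in: " + connectString);
        }
        // Shuffle once so that clients do not all try the same server first.
        Collections.shuffle(parsed);
        return new HostSet(parsed);
    }

    int size() {
        return hosts.size();
    }

    List<InetSocketAddress> hosts() {
        return Collections.unmodifiableList(hosts);
    }

    // Assumption: each host gets an equal share of the session timeout.
    int connectTimeoutMs(int sessionTimeoutMs) {
        return sessionTimeoutMs / hosts.size();
    }
}
```

With this shape, ClientCnxn would receive a HostSet built by HostSet.fromConnectString("zk1:2181,zk2:2181") and would not parse anything in its constructor. Periodic re-resolution could later be added inside the class without changing its callers.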




[jira] Commented: (ZOOKEEPER-922) enable faster timeout of sessions in case of unexpected socket disconnect

2010-11-18 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933557#action_12933557
 ] 

Benjamin Reed commented on ZOOKEEPER-922:
-

camille, i also think disabling moving sessions is not a good idea or very 
useful, but it seems to be the only way to have sensible semantics. 

may i suggest that we take this discussion a bit higher? i think there are 
fundamental assumptions that you are making that i'm questioning. can you write 
up a high-level design and state your assumptions? i can't quite see how the 
math works out between the client-server timeouts, connect timeouts, and lower 
session timeout. i'm also not clear on how much you are relying on a connection 
reset for the failure detection.

> enable faster timeout of sessions in case of unexpected socket disconnect
> -
>
> Key: ZOOKEEPER-922
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-922
> Project: Zookeeper
> Issue Type: Improvement
> Components: server
> Reporter: Camille Fournier
> Assignee: Camille Fournier
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-922.patch
>
>
> In the case when a client connection is closed due to socket error instead of 
> the client calling close explicitly, it would be nice to enable the session 
> associated with that client to time out faster than the negotiated session 
> timeout. This would enable a zookeeper ensemble that is acting as a dynamic 
> discovery provider to remove ephemeral nodes for crashed clients quickly, 
> while allowing for a longer heartbeat-based timeout for java clients that 
> need to do long stop-the-world GC. 
> I propose doing this by setting the timeout associated with the crashed 
> session to "minSessionTimeout".




[jira] Commented: (ZOOKEEPER-925) Consider maven site generation to replace our forrest site and documentation generation

2010-11-17 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933072#action_12933072
 ] 

Benjamin Reed commented on ZOOKEEPER-925:
-

yeah i tried doxia converter with various different formats and strategies.

the problem with db2rst is, even if i get it to rst, how do i get it to 
confluence?

i looked at the search/replace, but it turns out that we do use quite a few 
tags that are a bit complicated, so there isn't an easy way to do it. perhaps 
it would be easy with xsl, but i don't know xsl.

> Consider maven site generation to replace our forrest site and documentation 
> generation
> ---
>
> Key: ZOOKEEPER-925
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-925
> Project: Zookeeper
> Issue Type: Wish
> Components: documentation
> Reporter: Patrick Hunt
> Assignee: Patrick Hunt
> Attachments: ZOOKEEPER-925.patch
>
>
> See WHIRR-19 for some background.
> In whirr we looked at a number of site/doc generation facilities. In the end 
> the Maven site generation plugin turned out to be by far the best option. You 
> can see our nascent site here (no attempt at styling, etc. so far):
> http://incubator.apache.org/whirr/
> In particular take a look at the quick start:
> http://incubator.apache.org/whirr/quick-start-guide.html
> which was generated from
> http://svn.apache.org/repos/asf/incubator/whirr/trunk/src/site/confluence/quick-start-guide.confluence
> notice this was standard wiki markup (confluence wiki markup, same as 
> available from apache)
> You can read more about mvn site plugin here:
> http://maven.apache.org/guides/mini/guide-site.html
> Notice that other formats are available, not just confluence markup. Also 
> note that you can use different markup formats in the same site if you like 
> (probably not a great idea, but handy in some cases; for example, whirr uses 
> the confluence wiki, so we can pretty much copy/paste source docs from the 
> wiki to our site (svn) if we like).
> Re maven vs our current ant based build. It's probably a good idea for us to 
> move the build to maven at some point. We could initially move just the doc 
> generation, and then incrementally move functionality from build.xml to mvn 
> over a longer time period.




[jira] Commented: (ZOOKEEPER-366) Session timeout detection can go wrong if the leader system time changes

2010-11-16 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932809#action_12932809
 ] 

Benjamin Reed commented on ZOOKEEPER-366:
-

i haven't had a chance to get back to this. we really need to convert all the 
currentTimeMillis() calls to nanoTime(). we need to make a similar change in 
the C client.

i don't think we can do a test for this.
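
A minimal sketch of the conversion mentioned above, assuming a hypothetical MonotonicDeadline helper that is not part of ZooKeeper. It measures elapsed time from System.nanoTime(), which is unaffected by wall-clock adjustments. The second constructor and isExpiredAt() take explicit readings only so the arithmetic can be exercised without waiting.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not ZooKeeper code.
final class MonotonicDeadline {
    private final long startNanos;
    private final long timeoutNanos;

    MonotonicDeadline(long timeoutMs) {
        this(timeoutMs, System.nanoTime());
    }

    MonotonicDeadline(long timeoutMs, long startNanos) {
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        this.startNanos = startNanos;
    }

    boolean isExpired() {
        return isExpiredAt(System.nanoTime());
    }

    boolean isExpiredAt(long nowNanos) {
        // Compare the difference, never the raw values: nanoTime() has an
        // arbitrary origin and may wrap, but differences stay correct.
        return nowNanos - startNanos >= timeoutNanos;
    }
}
```

nanoTime() readings are only meaningful relative to other readings in the same JVM, so a deadline built this way cannot be sent to another process as an absolute timestamp.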

> Session timeout detection can go wrong if the leader system time changes
> 
>
> Key: ZOOKEEPER-366
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-366
> Project: Zookeeper
> Issue Type: Bug
> Components: quorum, server
> Reporter: Benjamin Reed
> Assignee: Benjamin Reed
> Fix For: 3.3.3, 3.4.0
>
> Attachments: ZOOKEEPER-366.patch
>
>
> the leader tracks session expirations by calculating when a session will 
> time out and then periodically checking to see what needs to be timed out 
> based on the current time. this works great as long as the leader's clock 
> progresses at a steady pace. the problem comes when there are big (session 
> size) changes in the clock, by ntp for example. if time gets adjusted forward, 
> all the sessions could time out immediately. if time goes backward, sessions 
> that should time out may take a lot longer to actually expire.
> this is really just a leader issue. the easiest way to deal with this is to 
> have the leader relinquish leadership if it detects a big jump forward in 
> time. when a new leader gets elected, it will recalculate timeouts of active 
> sessions.
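
The jump detection described in the issue could be sketched roughly as follows. ClockJumpDetector, its tick interval, and its tolerance are hypothetical names and parameters chosen for illustration; the issue does not specify how a "big jump" would be measured.

```java
// Hypothetical sketch; the issue does not prescribe this mechanism.
final class ClockJumpDetector {
    private final long tickIntervalMs;
    private final long toleranceMs;
    private long lastWallMs;

    ClockJumpDetector(long tickIntervalMs, long toleranceMs, long initialWallMs) {
        this.tickIntervalMs = tickIntervalMs;
        this.toleranceMs = toleranceMs;
        this.lastWallMs = initialWallMs;
    }

    // Call once per expiration check with the current wall-clock reading.
    // Returns true when the clock advanced by more than one tick interval
    // plus the tolerance since the previous call.
    boolean jumpedForward(long nowWallMs) {
        long delta = nowWallMs - lastWallMs;
        lastWallMs = nowWallMs;
        return delta > tickIntervalMs + toleranceMs;
    }
}
```

A leader running its expiration check every tick could call jumpedForward(System.currentTimeMillis()) and step down when it returns true. A long pause of the process itself (for example a stop-the-world GC) looks the same as a clock jump to this check, so the tolerance would need to be chosen with that in mind.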




[jira] Commented: (ZOOKEEPER-925) Consider maven site generation to replace our forrest site and documentation generation

2010-11-16 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932799#action_12932799
 ] 

Benjamin Reed commented on ZOOKEEPER-925:
-

i cannot figure out how to convert forrest to anything. actually, i can't 
figure out how we have forrest working at all! after burning the afternoon 
trying to figure out how to convert forrest to confluence, i'm officially 
declaring defeat. it should be an easy thing to do for an xml/xsl master, but 
that is not me.

the most promising thing appears to be the doxia converter that will go from a 
bunch of formats to a bunch more formats, including from docbook or xdoc to 
confluence. unfortunately, forrest seems close to both of those, but not close 
enough...




[jira] Commented: (ZOOKEEPER-922) enable faster timeout of sessions in case of unexpected socket disconnect

2010-11-16 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932639#action_12932639
 ] 

Benjamin Reed commented on ZOOKEEPER-922:
-

if we had a foolproof way to tell that a client is down, we could do this fast 
expire. the methods you are proposing are not foolproof and will lead to 
problems exactly when you most want them not to.

the timeout interactions you are talking about are problematic. it's really 
hard to get them right.

one way that i can see this working is to not allow clients to reconnect to 
other servers. in that case a socket reset would indicate an expired session. 
is this acceptable to you?




[jira] Commented: (ZOOKEEPER-925) Consider maven site generation to replace our forrest site and documentation generation

2010-11-16 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932592#action_12932592
 ] 

Benjamin Reed commented on ZOOKEEPER-925:
-

+1 for confluence

it would be great to target 1) for when we move to tlp.




[jira] Commented: (ZOOKEEPER-930) Hedwig c++ client uses a non thread safe logging library

2010-11-16 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932583#action_12932583
 ] 

Benjamin Reed commented on ZOOKEEPER-930:
-

thanx ivan!

> Hedwig c++ client uses a non thread safe logging library
> 
>
> Key: ZOOKEEPER-930
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-930
> Project: Zookeeper
> Issue Type: Bug
> Components: contrib-hedwig
> Affects Versions: 3.3.2
> Reporter: Ivan Kelly
> Assignee: Ivan Kelly
> Attachments: ZOOKEEPER-930.patch, ZOOKEEPER-930.patch
>
>





[jira] Updated: (ZOOKEEPER-930) Hedwig c++ client uses a non thread safe logging library

2010-11-16 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-930:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed revision 1035727.





[jira] Commented: (ZOOKEEPER-930) Hedwig c++ client uses a non thread safe logging library

2010-11-15 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932163#action_12932163
 ] 

Benjamin Reed commented on ZOOKEEPER-930:
-

looks good ivan. you should probably mention that you are moving to log4cxx for 
thread safety issues. the one minor thing: you messed up the indentation on a 
couple of lines. can you fix those?

> Hedwig c++ client uses a non thread safe logging library
> 
>
> Key: ZOOKEEPER-930
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-930
> Project: Zookeeper
> Issue Type: Bug
> Components: contrib-hedwig
> Affects Versions: 3.3.2
> Reporter: Ivan Kelly
> Assignee: Ivan Kelly
> Attachments: ZOOKEEPER-930.patch
>
>





[jira] Updated: (ZOOKEEPER-922) enable faster timeout of sessions in case of unexpected socket disconnect

2010-11-10 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-922:


Status: Open  (was: Patch Available)

the problem with your corner case is that you can end up with a leader who 
thinks it is still the leader, but zookeeper thinks the leader is dead and 
allows another leader to take over.

there may be a way to do this reliably, but we need to vet the design first.




[jira] Updated: (ZOOKEEPER-909) Extract NIO specific code from ClientCnxn

2010-11-10 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-909:


Hadoop Flags: [Reviewed]

+1 looks good thomas! thanx!

> Extract NIO specific code from ClientCnxn
> -
>
> Key: ZOOKEEPER-909
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-909
> Project: Zookeeper
> Issue Type: Sub-task
> Components: java client
> Reporter: Thomas Koch
> Assignee: Thomas Koch
> Fix For: 3.4.0
>
> Attachments: ClientCnxnSocketNetty.java, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch
>
>
> This patch is mostly the same patch as my last one for ZOOKEEPER-823 minus 
> everything Netty related. This means this patch only extract all NIO specific 
> code in the class ClientCnxnSocketNIO which extends ClientCnxnSocket.
> I've redone this patch from current trunk step by step now and couldn't find 
> any logical error. I've already done a couple of successful test runs and 
> will continue to do so this night.
> It would be nice, if we could apply this patch as soon as possible to trunk. 
> This allows us to continue to work on the netty integration without blocking 
> the ClientCnxn class. Adding Netty after this patch should be only a matter 
> of adding the ClientCnxnSocketNetty class with the appropriate test cases.
> You could help me by reviewing the patch and by running it on whatever test 
> server you have available. Please send me any complete failure log you should 
> encounter to thomas at koch point ro. Thx!
> Update: Until now, I've collected 8 successful builds in a row!




[jira] Commented: (ZOOKEEPER-925) Consider maven site generation to replace our forrest site and documentation generation

2010-11-09 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930526#action_12930526
 ] 

Benjamin Reed commented on ZOOKEEPER-925:
-

since maven generates the doc without requiring preinstalled tools, i don't 
think it is onerous at all to just check in the sources and require users to 
compile the doc if they are using trunk.




[jira] Commented: (ZOOKEEPER-925) Consider maven site generation to replace our forrest site and documentation generation

2010-11-09 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930517#action_12930517
 ] 

Benjamin Reed commented on ZOOKEEPER-925:
-

this is pretty cool! we can generate pdfs by using doxia converter to go from 
confluence to latex.




[jira] Commented: (ZOOKEEPER-925) Consider maven site generation to replace our forrest site and documentation generation

2010-11-09 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930221#action_12930221
 ] 

Benjamin Reed commented on ZOOKEEPER-925:
-

just to be clear: we should check in the source for the docs. i'm just saying 
that we should check in only the source for the docs, not the generated pdfs 
and web pages.

> Consider maven site generation to replace our forrest site and documentation 
> generation
> ---
>
> Key: ZOOKEEPER-925
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-925
> Project: Zookeeper
> Issue Type: Wish
> Components: documentation
> Reporter: Patrick Hunt
>



[jira] Commented: (ZOOKEEPER-925) Consider maven site generation to replace our forrest site and documentation generation

2010-11-09 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930205#action_12930205
 ] 

Benjamin Reed commented on ZOOKEEPER-925:
-

i'm totally interested in moving to maven site! i really really want to get 
away from forrest and make it a bit easier to write doc. can we also get away 
from checking in generated doc?

> Consider maven site generation to replace our forrest site and documentation 
> generation
> ---
>
> Key: ZOOKEEPER-925
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-925
> Project: Zookeeper
>  Issue Type: Wish
>  Components: documentation
>Reporter: Patrick Hunt
>
> See WHIRR-19 for some background.
> In whirr we looked at a number of site/doc generation facilities. In the end 
> Maven site generation plugin turned out to be by far the best option. You can 
> see our nascent site here (no attempt at styling,etc so far):
> http://incubator.apache.org/whirr/
> In particular take a look at the quick start:
> http://incubator.apache.org/whirr/quick-start-guide.html
> which was generated from
> http://svn.apache.org/repos/asf/incubator/whirr/trunk/src/site/confluence/quick-start-guide.confluence
> notice this was standard wiki markup (confluence wiki markup, same as 
> available from apache)
> You can read more about mvn site plugin here:
> http://maven.apache.org/guides/mini/guide-site.html
> Notice that other formats are available, not just confluence markup, also 
> note that you can use different markup formats if you like in the same site 
> (although probably not a great idea, but in some cases might be handy, for 
> example whirr uses the confluence wiki, so we can pretty much copy/paste 
> source docs from wiki to our site (svn) if we like)
> Re maven vs our current ant based build. It's probably a good idea for us to 
> move the build to maven at some point. We could initially move just the doc 
> generation, and then incrementally move functionality from build.xml to mvn 
> over a longer time period.




[jira] Commented: (ZOOKEEPER-922) enable faster timeout of sessions in case of unexpected socket disconnect

2010-11-08 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929683#action_12929683
 ] 

Benjamin Reed commented on ZOOKEEPER-922:
-

how do you deal with the following race condition:

1) the client is connected to follower1
2) the client has problems talking to follower1, so it closes the connection
3) the client connects to follower2
4) follower1 detects the closed connection and sets the connection timeout to 
min
5) the client is idle for min timeout and the leader expires the connection

the race condition is steps 3) and 4). if follower1 doesn't detect the dead 
connection fast enough, it can improperly set the timeout.
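
The interleaving above can be sketched in Java. This is only an illustration under stated assumptions: SessionTimeoutRace, its methods, and both timeout values are hypothetical and are not ZooKeeper's session tracker. The guarded variant is one possible check (only the follower that currently owns the session may shorten the timeout); it is not something proposed in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the leader's view of sessions; not ZooKeeper code.
class SessionTimeoutRace {
    static final int MIN_TIMEOUT_MS = 4_000;         // assumed minSessionTimeout
    static final int NEGOTIATED_TIMEOUT_MS = 30_000; // assumed negotiated timeout

    private final Map<Long, Integer> timeoutBySession = new HashMap<>();
    private final Map<Long, String> ownerBySession = new HashMap<>();

    // Steps 1) and 3): the client (re)connects through some follower.
    void connect(long sessionId, String follower) {
        ownerBySession.put(sessionId, follower);
        timeoutBySession.putIfAbsent(sessionId, NEGOTIATED_TIMEOUT_MS);
    }

    // Step 4), naive: any follower that notices a dead socket shortens the timeout.
    void onSocketClosedNaive(long sessionId, String follower) {
        timeoutBySession.put(sessionId, MIN_TIMEOUT_MS);
    }

    // Step 4), guarded: a report from a follower that no longer owns the session is ignored.
    void onSocketClosedGuarded(long sessionId, String follower) {
        if (follower.equals(ownerBySession.get(sessionId))) {
            timeoutBySession.put(sessionId, MIN_TIMEOUT_MS);
        }
    }

    int timeoutOf(long sessionId) {
        return timeoutBySession.get(sessionId);
    }

    public static void main(String[] args) {
        SessionTimeoutRace naive = new SessionTimeoutRace();
        naive.connect(1L, "follower1");             // step 1
        naive.connect(1L, "follower2");             // steps 2 and 3
        naive.onSocketClosedNaive(1L, "follower1"); // step 4 arrives late
        System.out.println("naive timeout: " + naive.timeoutOf(1L));

        SessionTimeoutRace guarded = new SessionTimeoutRace();
        guarded.connect(1L, "follower1");
        guarded.connect(1L, "follower2");
        guarded.onSocketClosedGuarded(1L, "follower1");
        System.out.println("guarded timeout: " + guarded.timeoutOf(1L));
    }
}
```

In the naive run the live session ends up with the minimum timeout even though the client is connected to follower2, which is the situation step 5) describes.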

> enable faster timeout of sessions in case of unexpected socket disconnect
> -
>
> Key: ZOOKEEPER-922
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-922
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Camille Fournier
>Assignee: Camille Fournier
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-922.patch
>
>
> In the case when a client connection is closed due to socket error instead of 
> the client calling close explicitly, it would be nice to enable the session 
> associated with that client to time out faster than the negotiated session 
> timeout. This would enable a zookeeper ensemble that is acting as a dynamic 
> discovery provider to remove ephemeral nodes for crashed clients quickly, 
> while allowing for a longer heartbeat-based timeout for java clients that 
> need to do long stop-the-world GC. 
> I propose doing this by setting the timeout associated with the crashed 
> session to "minSessionTimeout".




[jira] Updated: (ZOOKEEPER-909) Extract NIO specific code from ClientCnxn

2010-11-05 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-909:


Status: Open  (was: Patch Available)

once a couple of small changes are made to this patch, we should be good to go.

> Extract NIO specific code from ClientCnxn
> -
>
> Key: ZOOKEEPER-909
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-909
> Project: Zookeeper
>  Issue Type: Sub-task
>  Components: java client
>Reporter: Thomas Koch
>Assignee: Thomas Koch
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch
>
>
> This patch is mostly the same patch as my last one for ZOOKEEPER-823 minus 
> everything Netty related. This means this patch only extract all NIO specific 
> code in the class ClientCnxnSocketNIO which extends ClientCnxnSocket.
> I've redone this patch from current trunk step by step now and couldn't find 
> any logical error. I've already done a couple of successful test runs and 
> will continue to do so this night.
> It would be nice, if we could apply this patch as soon as possible to trunk. 
> This allows us to continue to work on the netty integration without blocking 
> the ClientCnxn class. Adding Netty after this patch should be only a matter 
> of adding the ClientCnxnSocketNetty class with the appropriate test cases.
> You could help me by reviewing the patch and by running it on whatever test 
> server you have available. Please send me any complete failure log you should 
> encounter to thomas at koch point ro. Thx!
> Update: Until now, I've collected 8 successful builds in a row!




[jira] Updated: (ZOOKEEPER-916) Problem receiving messages from subscribed channels in c++ client

2010-11-04 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-916:


Hadoop Flags: [Reviewed]

+1 thanx for the fix ivan!

> Problem receiving messages from subscribed channels in c++ client 
> --
>
> Key: ZOOKEEPER-916
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-916
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-hedwig
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: ZOOKEEPER-916.patch
>
>
> We see this bug with receiving messages from a subscribed channel.  This 
> problem seems to happen with larger messages.  The flow is to first read at 
> least 4 bytes from the socket channel. Extract the first 4 bytes to get the 
> message size.  If we've read enough data into the buffer already, we're done 
> so invoke the messageReadCallbackHandler passing the channel and message 
> size.  If not, then do an async read for at least the remaining amount of 
> bytes in the message from the socket channel.  When done, invoke the 
> messageReadCallbackHandler.
> The problem seems that when the second async read is done, the same 
> sizeReadCallbackHandler is invoked instead of the messageReadCallbackHandler. 
>  The result is that we then try to read the first 4 bytes again from the 
> buffer.  This will get a random message size and screw things up.  I'm not 
> sure if it's an incorrect use of the boost asio async_read function or we're 
> doing the boost bind to the callback function incorrectly.
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler system:0,512 channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler: size of buffer before reading message 
> size: 512 channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler: size of incoming message 599, 
> currently in buffer 508 channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler: Still have more data to read, 91 from 
> channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler system:0, 91 channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler: size of buffer before reading message 
> size: 599 channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler: size of incoming message 134287360, 
> currently in buffer 595 channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler: Still have more data to read, 
> 134286765 from channel(0x80b7a18)
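
The read flow described in the issue (read a 4-byte size, then read the rest of the message) can be sketched as a two-state decoder. This is an illustration in Java, not the hedwig C++ client or its boost asio code; LengthPrefixedDecoder and its methods are hypothetical, and big-endian byte order for the size field is an assumption. The point is that after the size is parsed the decoder must switch to the body state, so later bytes are never parsed as a size again.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative two-state decoder for messages with a 4-byte length prefix.
class LengthPrefixedDecoder {
    private enum State { READ_SIZE, READ_BODY }

    private State state = State.READ_SIZE;
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private int bodySize;

    // Feed bytes as they arrive; returns every message completed by this chunk.
    List<byte[]> feed(byte[] chunk) {
        pending.write(chunk, 0, chunk.length);
        List<byte[]> completed = new ArrayList<>();
        while (true) {
            byte[] buf = pending.toByteArray();
            if (state == State.READ_SIZE) {
                if (buf.length < 4) {
                    break; // size prefix not complete yet
                }
                bodySize = ByteBuffer.wrap(buf, 0, 4).getInt(); // big-endian (assumed)
                if (bodySize < 0) {
                    throw new IllegalStateException("negative message size: " + bodySize);
                }
                consume(buf, 4);
                state = State.READ_BODY; // the next bytes are body, not another size
            } else {
                if (buf.length < bodySize) {
                    break; // wait for the remaining body bytes
                }
                byte[] body = new byte[bodySize];
                System.arraycopy(buf, 0, body, 0, bodySize);
                completed.add(body);
                consume(buf, bodySize);
                state = State.READ_SIZE;
            }
        }
        return completed;
    }

    // Drop the first n bytes of the pending buffer.
    private void consume(byte[] buf, int n) {
        pending.reset();
        pending.write(buf, n, buf.length - n);
    }
}
```

With the numbers from the log, a first chunk of 512 bytes yields the size 599 and leaves 508 body bytes pending; a second chunk of 91 bytes completes the message.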




[jira] Updated: (ZOOKEEPER-916) Problem receiving messages from subscribed channels in c++ client

2010-11-04 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-916:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed revision 1031453.


> Problem receiving messages from subscribed channels in c++ client 
> --
>
> Key: ZOOKEEPER-916
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-916
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-hedwig
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: ZOOKEEPER-916.patch
>
>
> We see this bug with receiving messages from a subscribed channel.  This 
> problem seems to happen with larger messages.  The flow is to first read at 
> least 4 bytes from the socket channel. Extract the first 4 bytes to get the 
> message size.  If we've read enough data into the buffer already, we're done 
> so invoke the messageReadCallbackHandler passing the channel and message 
> size.  If not, then do an async read for at least the remaining amount of 
> bytes in the message from the socket channel.  When done, invoke the 
> messageReadCallbackHandler.
> The problem seems that when the second async read is done, the same 
> sizeReadCallbackHandler is invoked instead of the messageReadCallbackHandler. 
>  The result is that we then try to read the first 4 bytes again from the 
> buffer.  This will get a random message size and screw things up.  I'm not 
> sure if it's an incorrect use of the boost asio async_read function or we're 
> doing the boost bind to the callback function incorrectly.
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler system:0,512 channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler: size of buffer before reading message 
> size: 512 channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler: size of incoming message 599, 
> currently in buffer 508 channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler: Still have more data to read, 91 from 
> channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler system:0, 91 channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler: size of buffer before reading message 
> size: 599 channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler: size of incoming message 134287360, 
> currently in buffer 595 channel(0x80b7a18)
> 101015 15:30:40.108 DEBUG hedwig.channel.cpp - 
> DuplexChannel::sizeReadCallbackHandler: Still have more data to read, 
> 134286765 from channel(0x80b7a18)




[jira] Updated: (ZOOKEEPER-862) Hedwig created ledgers with hardcoded Bookkeeper ensemble and quorum size. Make these a server config parameter instead.

2010-11-04 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-862:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

+1 looks good thanx Erwin

it looks like this was accidentally committed in r1031051

> Hedwig created ledgers with hardcoded Bookkeeper ensemble and quorum size.  
> Make these a server config parameter instead.
> -
>
> Key: ZOOKEEPER-862
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-862
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: contrib-hedwig
>Reporter: Erwin Tam
>Assignee: Erwin Tam
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-862.patch
>
>
> Hedwig code right now when using Bookkeeper as the persistence store is 
> hardcoding the number of bookie servers in the ensemble and quorum size.  
> This is used the first time a ledger is created.  This should be exposed as a 
> server configuration parameter instead.
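
A minimal sketch of what exposing these two values as server configuration could look like. Everything here is an assumption for illustration: the class name, the property keys, and the default values 3 and 2 are hypothetical and are not taken from the Hedwig patch.

```java
import java.util.Properties;

// Hypothetical configuration holder; keys and defaults are illustrative only.
class LedgerConfigSketch {
    static final String ENSEMBLE_SIZE_KEY = "bk_ensemble_size"; // assumed key
    static final String QUORUM_SIZE_KEY = "bk_quorum_size";     // assumed key

    final int ensembleSize;
    final int quorumSize;

    LedgerConfigSketch(Properties props) {
        // The defaults stand in for whatever values were previously hardcoded.
        ensembleSize = Integer.parseInt(props.getProperty(ENSEMBLE_SIZE_KEY, "3"));
        quorumSize = Integer.parseInt(props.getProperty(QUORUM_SIZE_KEY, "2"));
        if (quorumSize < 1 || quorumSize > ensembleSize) {
            throw new IllegalArgumentException(
                "quorum size must be between 1 and the ensemble size");
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(ENSEMBLE_SIZE_KEY, "5");
        props.setProperty(QUORUM_SIZE_KEY, "3");
        LedgerConfigSketch cfg = new LedgerConfigSketch(props);
        System.out.println(cfg.ensembleSize + " bookies, quorum of " + cfg.quorumSize);
    }
}
```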




[jira] Updated: (ZOOKEEPER-884) Remove LedgerSequence references from BookKeeper documentation and comments in tests

2010-11-04 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-884:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

+1 thanx flavio

Committed revision 1031433.


> Remove LedgerSequence references from BookKeeper documentation and comments 
> in tests 
> -
>
> Key: ZOOKEEPER-884
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-884
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-bookkeeper
>Affects Versions: 3.3.1
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-884.patch
>
>
> We no longer use LedgerSequence, so we need to remove references in 
> documentation and comments sprinkled throughout the code.




[jira] Resolved: (ZOOKEEPER-907) Spurious "KeeperErrorCode = Session moved" messages

2010-11-04 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed resolved ZOOKEEPER-907.
-

Resolution: Fixed

Committed revision 1031051.
Committed revision 1031064.


> Spurious "KeeperErrorCode = Session moved" messages
> ---
>
> Key: ZOOKEEPER-907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2
>
>
> The sync request does not set the session owner in Request.
> As a result, the leader keeps printing:
> 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
> Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
> type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
> Path:null Error:KeeperErrorCode = Session moved




[jira] Commented: (ZOOKEEPER-907) Spurious "KeeperErrorCode = Session moved" messages

2010-11-03 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928127#action_12928127
 ] 

Benjamin Reed commented on ZOOKEEPER-907:
-

+1 looks great vishal thanx for the fix!

> Spurious "KeeperErrorCode = Session moved" messages
> ---
>
> Key: ZOOKEEPER-907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2
>
>
> The sync request does not set the session owner in Request.
> As a result, the leader keeps printing:
> 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
> Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
> type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
> Path:null Error:KeeperErrorCode = Session moved




[jira] Updated: (ZOOKEEPER-907) Spurious "KeeperErrorCode = Session moved" messages

2010-11-03 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-907:


Hadoop Flags: [Reviewed]

> Spurious "KeeperErrorCode = Session moved" messages
> ---
>
> Key: ZOOKEEPER-907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2
>
>
> The sync request does not set the session owner in Request.
> As a result, the leader keeps printing:
> 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
> Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
> type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
> Path:null Error:KeeperErrorCode = Session moved




[jira] Commented: (ZOOKEEPER-909) Extract NIO specific code from ClientCnxn

2010-11-01 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927045#action_12927045
 ] 

Benjamin Reed commented on ZOOKEEPER-909:
-

the patch looks good. are you proposing that we commit it? or are you still 
working on it? i don't mind pushing off the javadoc for a bit if you think 
things might change. (although it would be nice to get that class more firmed 
up before we commit really...) we should get the property doc in before we 
commit since that will not change.

One other nit, if you are willing: calling the ClientCnxnSocket "socket" and 
using "getSocket" is a bit confusing since ClientCnxnSocket does not extend 
Socket. It's a bit more verbose, but clearer if you call the member and 
method "clientCnxnSocket" and "getClientCnxnSocket".

> Extract NIO specific code from ClientCnxn
> -
>
> Key: ZOOKEEPER-909
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-909
> Project: Zookeeper
>  Issue Type: Sub-task
>  Components: java client
>Reporter: Thomas Koch
>Assignee: Thomas Koch
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch
>
>
> This patch is mostly the same patch as my last one for ZOOKEEPER-823 minus 
> everything Netty related. This means this patch only extract all NIO specific 
> code in the class ClientCnxnSocketNIO which extends ClientCnxnSocket.
> I've redone this patch from current trunk step by step now and couldn't find 
> any logical error. I've already done a couple of successful test runs and 
> will continue to do so this night.
> It would be nice, if we could apply this patch as soon as possible to trunk. 
> This allows us to continue to work on the netty integration without blocking 
> the ClientCnxn class. Adding Netty after this patch should be only a matter 
> of adding the ClientCnxnSocketNetty class with the appropriate test cases.
> You could help me by reviewing the patch and by running it on whatever test 
> server you have available. Please send me any complete failure log you should 
> encounter to thomas at koch point ro. Thx!
> Update: Until now, I've collected 8 successful builds in a row!




[jira] Commented: (ZOOKEEPER-907) Spurious "KeeperErrorCode = Session moved" messages

2010-10-29 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926404#action_12926404
 ] 

Benjamin Reed commented on ZOOKEEPER-907:
-

may i propose accepting this patch without a test case? (we can see that it 
fixes the problem.) that way we can get 3.3.2 out. once ZOOKEEPER-915 goes in, 
the tests should cover this issue.

> Spurious "KeeperErrorCode = Session moved" messages
> ---
>
> Key: ZOOKEEPER-907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2
>
>
> The sync request does not set the session owner in Request.
> As a result, the leader keeps printing:
> 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
> Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
> type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
> Path:null Error:KeeperErrorCode = Session moved




[jira] Created: (ZOOKEEPER-915) Errors that happen during sync() processing at the leader do not get propagated back to the client.

2010-10-28 Thread Benjamin Reed (JIRA)
Errors that happen during sync() processing at the leader do not get propagated 
back to the client.
---

 Key: ZOOKEEPER-915
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-915
 Project: Zookeeper
  Issue Type: Bug
Reporter: Benjamin Reed


If an error in sync() processing happens at the leader (SESSION_MOVED for 
example), it is not propagated back to the client.




[jira] Commented: (ZOOKEEPER-907) Spurious "KeeperErrorCode = Session moved" messages

2010-10-28 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925976#action_12925976
 ] 

Benjamin Reed commented on ZOOKEEPER-907:
-

Ah, I see the problem. There are actually two problems: 1) when sync() gets an 
error it is not propagated back to the caller. 2) this problem.

The problem is that 1) is preventing us from writing a test case. We need to 
fix 1) and then we can write the test for 2).

> Spurious "KeeperErrorCode = Session moved" messages
> ---
>
> Key: ZOOKEEPER-907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2
>
>
> The sync request does not set the session owner in Request.
> As a result, the leader keeps printing:
> 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
> Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
> type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
> Path:null Error:KeeperErrorCode = Session moved




[jira] Commented: (ZOOKEEPER-907) Spurious "KeeperErrorCode = Session moved" messages

2010-10-27 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925540#action_12925540
 ] 

Benjamin Reed commented on ZOOKEEPER-907:
-

ah got it. ok i was able to reproduce it: the client connects to the follower, 
issues a sync, the error message shows up in the log of the leader. so there is 
an additional bug here -- why is the client not getting the session moved error?

> Spurious "KeeperErrorCode = Session moved" messages
> ---
>
> Key: ZOOKEEPER-907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2
>
>
> The sync request does not set the session owner in Request.
> As a result, the leader keeps printing:
> 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
> Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
> type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
> Path:null Error:KeeperErrorCode = Session moved




[jira] Commented: (ZOOKEEPER-907) Spurious "KeeperErrorCode = Session moved" messages

2010-10-27 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925536#action_12925536
 ] 

Benjamin Reed commented on ZOOKEEPER-907:
-

ah sorry, i misunderstood the issue. so your client that issues the sync gets a 
zero return code, but you see that message in the log. right?

> Spurious "KeeperErrorCode = Session moved" messages
> ---
>
> Key: ZOOKEEPER-907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2
>
>
> The sync request does not set the session owner in Request.
> As a result, the leader keeps printing:
> 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
> Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
> type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
> Path:null Error:KeeperErrorCode = Session moved




[jira] Commented: (ZOOKEEPER-907) Spurious "KeeperErrorCode = Session moved" messages

2010-10-27 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925487#action_12925487
 ] 

Benjamin Reed commented on ZOOKEEPER-907:
-

the test case should connect to a follower and do a sync.

i cannot reproduce this problem. if i connect to the leader or the follower and 
issue a sync() the return code is always 0.

> Spurious "KeeperErrorCode = Session moved" messages
> ---
>
> Key: ZOOKEEPER-907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-907.patch, ZOOKEEPER-907.patch_v2
>
>
> The sync request does not set the session owner in Request.
> As a result, the leader keeps printing:
> 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
> Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
> type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
> Path:null Error:KeeperErrorCode = Session moved




[jira] Commented: (ZOOKEEPER-909) Extract NIO specific code from ClientCnxn

2010-10-22 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923905#action_12923905
 ] 

Benjamin Reed commented on ZOOKEEPER-909:
-

this is looking really nice. i'm not done reviewing, but i did want to note 
that you need to add the zookeeper.clientCnxnSocket property to the doc. You 
should also javadoc that variable. 
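
For readers unfamiliar with this kind of property, here is a minimal sketch of how a system property can select an implementation class by name. It is not the code from the patch: TransportSketch, NioTransportSketch, and TransportFactorySketch are hypothetical, and the key is spelled here after the ClientCnxnSocket class name, which is an assumption.

```java
// Hypothetical transport abstraction standing in for ClientCnxnSocket.
abstract class TransportSketch {
    abstract String name();
}

// Hypothetical default implementation standing in for ClientCnxnSocketNIO.
class NioTransportSketch extends TransportSketch {
    @Override
    String name() {
        return "nio";
    }
}

class TransportFactorySketch {
    // Property key as assumed in the lead-in above.
    static final String PROPERTY = "zookeeper.clientCnxnSocket";

    // Instantiates the class named by the property, or the NIO default if unset.
    static TransportSketch create() {
        String className =
            System.getProperty(PROPERTY, NioTransportSketch.class.getName());
        try {
            return (TransportSketch) Class.forName(className)
                .getDeclaredConstructor()
                .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(create().name());
    }
}
```

A property like this is only discoverable if it is documented, which is the reason for asking that it go into the doc and the javadoc.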

> Extract NIO specific code from ClientCnxn
> -
>
> Key: ZOOKEEPER-909
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-909
> Project: Zookeeper
>  Issue Type: Sub-task
>  Components: java client
>Reporter: Thomas Koch
>Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-909.patch, ZOOKEEPER-909.patch, 
> ZOOKEEPER-909.patch
>
>
> This patch is mostly the same patch as my last one for ZOOKEEPER-823 minus 
> everything Netty related. This means this patch only extract all NIO specific 
> code in the class ClientCnxnSocketNIO which extends ClientCnxnSocket.
> I've redone this patch from current trunk step by step now and couldn't find 
> any logical error. I've already done a couple of successful test runs and 
> will continue to do so this night.
> It would be nice, if we could apply this patch as soon as possible to trunk. 
> This allows us to continue to work on the netty integration without blocking 
> the ClientCnxn class. Adding Netty after this patch should be only a matter 
> of adding the ClientCnxnSocketNetty class with the appropriate test cases.
> You could help me by reviewing the patch and by running it on whatever test 
> server you have available. Please send me any complete failure log you should 
> encounter to thomas at koch point ro. Thx!
> Update: Until now, I've collected 8 successful builds in a row!




[jira] Commented: (ZOOKEEPER-907) Spurious "KeeperErrorCode = Session moved" messages

2010-10-22 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923895#action_12923895
 ] 

Benjamin Reed commented on ZOOKEEPER-907:
-

sync doesn't cause any additional traffic over the atomic broadcast. it just 
makes sure that all of the in-process transactions have been sent to the 
follower. when that error happens, the error will be sent back to the follower 
ordered after all of the completed transactions. so rather than being able to 
see the result of all requests initiated before the sync, the follower will see 
all requests completed before the sync. that is why i referred to it as a 
partial sync.

i'm really having problems trying to reproduce this error. can you describe 
more how it happened? i would like to have an end-to-end test rather than the 
test of a particular implementation so that this error doesn't pop up if the 
implementation changes. looking at the code it seems like it should happen 
every time the sync request is sent to a follower, but that doesn't seem to be 
the case.

> Spurious "KeeperErrorCode = Session moved" messages
> ---
>
> Key: ZOOKEEPER-907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-907.patch
>
>
> The sync request does not set the session owner in Request.
> As a result, the leader keeps printing:
> 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
> Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
> type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
> Path:null Error:KeeperErrorCode = Session moved




[jira] Commented: (ZOOKEEPER-907) Spurious "KeeperErrorCode = Session moved" messages

2010-10-20 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923211#action_12923211
 ] 

Benjamin Reed commented on ZOOKEEPER-907:
-

wow, that is a really dumb error! vishal, could i ask you to modify your patch 
to avoid code duplication?
we should probably have "Request si;" outside of the if, and then just set si 
inside the if statement, and then do the setOwner and submit after the if block 
to avoid code duplication. that way, if we make another change in the future, 
we don't run into this again.
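
A minimal sketch of the shape suggested above. SketchRequest and 
SketchSubmitter are hypothetical stand-ins, not the actual ZooKeeper classes, 
and the list append only stands in for the real submit call. The point is that 
the branches differ only in how si is built, while setOwner and submit appear 
once.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a server request; not the real ZooKeeper class.
class SketchRequest {
    final String type;
    Object owner;

    SketchRequest(String type) {
        this.type = type;
    }

    void setOwner(Object owner) {
        this.owner = owner;
    }
}

// Hypothetical caller showing the shape of the suggested change.
class SketchSubmitter {
    final List<SketchRequest> submitted = new ArrayList<>();

    void handle(boolean isSync, Object owner) {
        SketchRequest si;                      // declared once, outside the if
        if (isSync) {
            si = new SketchRequest("sync");    // only the construction differs
        } else {
            si = new SketchRequest("other");
        }
        si.setOwner(owner);                    // shared step, written once
        submitted.add(si);                     // stands in for the real submit
    }
}
```

With this layout a new request type only adds a branch that assigns si, so it 
cannot skip the owner assignment by accident.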

let me know if you need help with the test.

> Spurious "KeeperErrorCode = Session moved" messages
> ---
>
> Key: ZOOKEEPER-907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-907.patch
>
>
> The sync request does not set the session owner in Request.
> As a result, the leader keeps printing:
> 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
> Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
> type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
> Path:null Error:KeeperErrorCode = Session moved




[jira] Updated: (ZOOKEEPER-794) Callbacks are not invoked when the client is closed

2010-10-20 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-794:


Hadoop Flags: [Reviewed]

+1 thanx for sticking with this one Alexis!

> Callbacks are not invoked when the client is closed
> ---
>
> Key: ZOOKEEPER-794
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-794
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.3.1
>Reporter: Alexis Midon
>Assignee: Alexis Midon
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-794.patch.txt, ZOOKEEPER-794.txt, 
> ZOOKEEPER-794_2.patch, ZOOKEEPER-794_3.patch, ZOOKEEPER-794_4.patch.txt, 
> ZOOKEEPER-794_5.patch.txt, ZOOKEEPER-794_5_br33.patch
>
>
> I noticed that ZooKeeper has different behaviors when calling synchronous or 
> asynchronous actions on a closed ZooKeeper client.
> Actually a synchronous call will throw a "session expired" exception while an 
> asynchronous call will do nothing. No exception, no callback invocation.
> Actually, even if the EventThread receives the Packet with the session 
> expired err code, the packet is never processed since the thread has been 
> killed by the eventOfDeath. So the callback is not invoked.




[jira] Commented: (ZOOKEEPER-907) Spurious "KeeperErrorCode = Session moved" messages

2010-10-20 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923200#action_12923200
 ] 

Benjamin Reed commented on ZOOKEEPER-907:
-

yes, this will fail the sync. it will not get passed through the pipeline. it 
will give you a partial sync though :)

> Spurious "KeeperErrorCode = Session moved" messages
> ---
>
> Key: ZOOKEEPER-907
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-907
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-907.patch
>
>
> The sync request does not set the session owner in Request.
> As a result, the leader keeps printing:
> 2010-07-01 10:55:36,733 - INFO  [ProcessThread:-1:preprequestproces...@405] - 
> Got user-level KeeperException when processing sessionid:0x298d3b1fa9 
> type:sync: cxid:0x6 zxid:0xfffe txntype:unknown reqpath:/ Error 
> Path:null Error:KeeperErrorCode = Session moved




[jira] Commented: (ZOOKEEPER-835) Refactoring Zookeeper Client Code

2010-10-19 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922813#action_12922813
 ] 

Benjamin Reed commented on ZOOKEEPER-835:
-

how do you see any of these things as related to ZOOKEEPER-22?

> Refactoring Zookeeper Client Code
> -
>
> Key: ZOOKEEPER-835
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-835
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: java client
>Affects Versions: 3.3.1
>Reporter: Patrick Datko
>Assignee: Thomas Koch
>
> Thomas Koch asked me to file individual issues for the points raised in his 
> mail to zookeeper-dev:
> [Mail of Thomas Koch| 
> http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-dev/201008.mbox/%3c20100845.17507.tho...@koch.ro%3e
>  ]
> He listed several issues that are present in the current zookeeper client, 
> so refactoring the code would make things easier for other developers 
> working with zookeeper.




[jira] Commented: (ZOOKEEPER-885) Zookeeper drops connections under moderate IO load

2010-10-15 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921412#action_12921412
 ] 

Benjamin Reed commented on ZOOKEEPER-885:
-

we are having problems reproducing this. can you give a bit more detail on the 
machines you are using? what are the cpu and memory size? also, what is the 
throughput of dd if=/dev/zero of=/dev/mapper/nimbula-test? is there just one 
disk, where nimbula-test is a partition on that disk and you have another 
partition for the snapshots and logs?

even if you don't have swap space, code pages can be discarded and loaded on 
demand, so that could be a potential problem. what does /proc/meminfo look like?

> Zookeeper drops connections under moderate IO load
> --
>
> Key: ZOOKEEPER-885
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.2.2, 3.3.1
> Environment: Debian (Lenny)
> 1Gb RAM
> swap disabled
> 100Mb heap for zookeeper
>Reporter: Alexandre Hardy
>Priority: Critical
> Attachments: benchmark.csv, tracezklogs.tar.gz, tracezklogs.tar.gz, 
> WatcherTest.java, zklogs.tar.gz
>
>
> A zookeeper server under minimum load, with a number of clients watching 
> exactly one node will fail to maintain the connection when the machine is 
> subjected to moderate IO load.
> In a specific test example we had three zookeeper servers running on 
> dedicated machines with 45 clients connected, watching exactly one node. The 
> clients would disconnect after moderate load was added to each of the 
> zookeeper servers with the command:
> {noformat}
> dd if=/dev/urandom of=/dev/mapper/nimbula-test
> {noformat}
> The {{dd}} command transferred data at a rate of about 4Mb/s.
> The same thing happens with
> {noformat}
> dd if=/dev/zero of=/dev/mapper/nimbula-test
> {noformat}
> It seems strange that such a moderate load should cause instability in the 
> connection.
> Very few other processes were running; the machines were set up to test the 
> connection instability we have experienced. Clients performed no other read 
> or mutation operations.
> Although the documents state that minimal competing IO load should be present on 
> the zookeeper server, it seems reasonable that moderate IO should not cause 
> problems in this case.




[jira] Commented: (ZOOKEEPER-881) ZooKeeperServer.loadData loads database twice

2010-10-14 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921233#action_12921233
 ] 

Benjamin Reed commented on ZOOKEEPER-881:
-

Committed revision 1022824.


> ZooKeeperServer.loadData loads database twice
> -
>
> Key: ZOOKEEPER-881
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-881
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Trivial
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-881.patch
>
>
> zkDb.loadDataBase() is called twice at the beginning of loadData().  It 
> shouldn't have any negative effects, but is unnecessary.   A patch should be 
> trivial.




[jira] Updated: (ZOOKEEPER-881) ZooKeeperServer.loadData loads database twice

2010-10-14 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-881:


Hadoop Flags: [Reviewed]

+1 nice catch!

> ZooKeeperServer.loadData loads database twice
> -
>
> Key: ZOOKEEPER-881
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-881
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Jared Cantwell
>Assignee: Jared Cantwell
>Priority: Trivial
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-881.patch
>
>
> zkDb.loadDataBase() is called twice at the beginning of loadData().  It 
> shouldn't have any negative effects, but is unnecessary.   A patch should be 
> trivial.




[jira] Updated: (ZOOKEEPER-886) Hedwig Server stays in "disconnected" state when connection to ZK dies but gets reconnected

2010-10-11 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-886:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed revision 1021501.


> Hedwig Server stays in "disconnected" state when connection to ZK dies but 
> gets reconnected
> ---
>
> Key: ZOOKEEPER-886
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-886
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-hedwig
>Reporter: Erwin Tam
>Assignee: Erwin Tam
> Attachments: ZOOKEEPER-886.patch
>
>
> The Hedwig Server is connected to ZooKeeper.  In the ZkTopicManager, it 
> registers a watcher so that if it ever gets disconnected from ZK, it will 
> temporarily fail all incoming requests since the Hedwig server does not know 
> for sure if it is still the master for the topics.  When the ZK client gets 
> reconnected, the logic currently is wrong and it does not unset the suspended 
> flag.  Thus once it gets disconnected, it will stay in the suspended state 
> forever, thereby making the Hedwig server hub dead.




[jira] Updated: (ZOOKEEPER-886) Hedwig Server stays in "disconnected" state when connection to ZK dies but gets reconnected

2010-10-11 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-886:


Hadoop Flags: [Reviewed]

+1 good catch erwin!

> Hedwig Server stays in "disconnected" state when connection to ZK dies but 
> gets reconnected
> ---
>
> Key: ZOOKEEPER-886
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-886
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-hedwig
>Reporter: Erwin Tam
>Assignee: Erwin Tam
> Attachments: ZOOKEEPER-886.patch
>
>
> The Hedwig Server is connected to ZooKeeper.  In the ZkTopicManager, it 
> registers a watcher so that if it ever gets disconnected from ZK, it will 
> temporarily fail all incoming requests since the Hedwig server does not know 
> for sure if it is still the master for the topics.  When the ZK client gets 
> reconnected, the logic currently is wrong and it does not unset the suspended 
> flag.  Thus once it gets disconnected, it will stay in the suspended state 
> forever, thereby making the Hedwig server hub dead.




[jira] Updated: (ZOOKEEPER-864) Hedwig C++ client improvements

2010-10-11 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-864:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

thanx ivan!
Committed revision 1021463.


> Hedwig C++ client improvements
> --
>
> Key: ZOOKEEPER-864
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-864
> Project: Zookeeper
>  Issue Type: Improvement
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.4.0
>
> Attachments: warnings.txt, ZOOKEEPER-864.diff, ZOOKEEPER-864.diff, 
> ZOOKEEPER-864.diff, ZOOKEEPER-864.diff
>
>
> I changed the socket code to use boost asio. Now the client only creates one 
> thread, and all operations are non-blocking. 
> Tests are now automated, just run "make check".




[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-10-05 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-822:


Hadoop Flags: [Reviewed]

+1 looks good. ready to commit.

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere from 25 to 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.




[jira] Commented: (ZOOKEEPER-880) QuorumCnxManager$SendWorker grows without bounds

2010-09-28 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915849#action_12915849
 ] 

Benjamin Reed commented on ZOOKEEPER-880:
-

is there an easy way to reproduce this?

> QuorumCnxManager$SendWorker grows without bounds
> 
>
> Key: ZOOKEEPER-880
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-880
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.2.2
>Reporter: Jean-Daniel Cryans
> Attachments: hbase-hadoop-zookeeper-sv4borg12.log.gz, 
> hbase-hadoop-zookeeper-sv4borg9.log.gz, jstack, 
> TRACE-hbase-hadoop-zookeeper-sv4borg9.log.gz
>
>
> We're seeing an issue where one server in the ensemble has a steadily growing 
> number of QuorumCnxManager$SendWorker threads up to a point where the OS runs 
> out of native threads, and at the same time we see a lot of exceptions in the 
> logs.  This is on 3.2.2 and our config looks like:
> {noformat}
> tickTime=3000
> dataDir=/somewhere_thats_not_tmp
> clientPort=2181
> initLimit=10
> syncLimit=5
> server.0=sv4borg9:2888:3888
> server.1=sv4borg10:2888:3888
> server.2=sv4borg11:2888:3888
> server.3=sv4borg12:2888:3888
> server.4=sv4borg13:2888:3888
> {noformat}
> The issue is on the first server. I'm going to attach thread dumps and logs 
> in a moment.




[jira] Commented: (ZOOKEEPER-820) update c unit tests to ensure "zombie" java server processes don't cause failure

2010-09-28 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915799#action_12915799
 ] 

Benjamin Reed commented on ZOOKEEPER-820:
-

+1 this looks good to me. did you try it on cygwin?

> update c unit tests to ensure "zombie" java server processes don't cause 
> failure
> 
>
> Key: ZOOKEEPER-820
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-820
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Patrick Hunt
>Assignee: Michi Mutsuzaki
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-820-1.patch, ZOOKEEPER-820.patch
>
>
> When the c unit tests are run sometimes the server doesn't shutdown at the 
> end of the test, this causes subsequent tests (hudson esp) to fail.
> 1) we should try harder to make the server shut down at the end of the test, 
> I suspect this is related to test failing/cleanup
> 2) before the tests are run we should see if the old server is still running 
> and try to shut it down




[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete

2010-09-28 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915796#action_12915796
 ] 

Benjamin Reed commented on ZOOKEEPER-822:
-

looks good overall flavio. just a quick question: i notice that operations on 
senderWorkerMap in initiateConnection are not synchronized. senderWorkerMap is 
concurrent, but there could be a race between the get, put, and vsw.finish if 
initiateConnection is called concurrently for the same sid. right?
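
A minimal sketch of the kind of race meant here, using hypothetical 
SketchWorker and SketchWorkerMap classes rather than the real SendWorker or 
QuorumCnxManager code. Each map call is atomic on its own, but the get, finish, 
put sequence is not. One possible way to close the gap (an assumption, not 
necessarily what the patch should do) is to rely on the value returned by put.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a per-peer sender thread.
class SketchWorker {
    final AtomicInteger finishCalls = new AtomicInteger();

    void finish() {
        finishCalls.incrementAndGet();
    }
}

class SketchWorkerMap {
    final ConcurrentHashMap<Long, SketchWorker> workers = new ConcurrentHashMap<>();

    // Racy check-then-act: two threads can both read the same old worker,
    // both call finish() on it, and then both put. The worker stored by the
    // first put is overwritten by the second and is never finished.
    void replaceRacy(long sid, SketchWorker fresh) {
        SketchWorker old = workers.get(sid);
        if (old != null) {
            old.finish();
        }
        workers.put(sid, fresh);
    }

    // put() atomically returns the value it displaced, so each displaced
    // worker is finished exactly once regardless of interleaving.
    void replaceAtomic(long sid, SketchWorker fresh) {
        SketchWorker old = workers.put(sid, fresh);
        if (old != null) {
            old.finish();
        }
    }
}
```

Synchronizing the whole method on a shared lock would also work; the put-based 
variant just keeps the critical section inside the map.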

also you need to add a blurb to the config doc for the timeout system variable, 
which should be "zookeeper.cnxtimeout" so that it can be set from the 
configuration file.
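
A small sketch of reading such a setting. Only the property name 
"zookeeper.cnxtimeout" comes from the comment above; the SketchCnxTimeout class 
and the 5000 ms fallback are assumptions made for illustration.

```java
// Hypothetical helper; only the property name is taken from the discussion.
class SketchCnxTimeout {
    static final String PROPERTY = "zookeeper.cnxtimeout";
    static final int ASSUMED_DEFAULT_MS = 5000; // assumed fallback for this sketch

    // Integer.getInteger reads a system property and parses it as an int,
    // returning the fallback when the property is unset or not a number.
    static int read() {
        return Integer.getInteger(PROPERTY, ASSUMED_DEFAULT_MS);
    }
}
```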

> Leader election taking a long time  to complete
> ---
>
> Key: ZOOKEEPER-822
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.0
>Reporter: Vishal K
>Assignee: Vishal K
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, 
> test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822-3.3.2.patch, 
> ZOOKEEPER-822-3.3.2.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, ZOOKEEPER-822.patch, 
> ZOOKEEPER-822.patch_v1
>
>
> Created a 3 node cluster.
> 1 Fail the ZK leader
> 2. Let leader election finish. Restart the leader and let it join the 
> 3. Repeat 
> After a few rounds leader election takes anywhere from 25 to 60 seconds to finish. 
> Note- we didn't have any ZK clients and no new znodes were created.
> zoo.cfg is shown below:
> #Mon Jul 19 12:15:10 UTC 2010
> server.1=192.168.4.12\:2888\:3888
> server.0=192.168.4.11\:2888\:3888
> clientPort=2181
> dataDir=/var/zookeeper
> syncLimit=2
> server.2=192.168.4.13\:2888\:3888
> initLimit=5
> tickTime=2000
> I have attached logs from two nodes that took a long time to form the cluster 
> after failing the leader. The leader was down anyways so logs from that node 
> shouldn't matter.
> Look for "START HERE". Logs after that point should be of our interest.




[jira] Updated: (ZOOKEEPER-831) BookKeeper: Throttling improved for reads

2010-09-17 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-831:


Status: Resolved  (was: Patch Available)
Resolution: Fixed

Committed revision 998200.

thanx for the fix flavio and ivan for the reviews!

> BookKeeper: Throttling improved for reads
> -
>
> Key: ZOOKEEPER-831
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-831
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-bookkeeper
>Affects Versions: 3.3.1
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-831.patch, ZOOKEEPER-831.patch, 
> ZOOKEEPER-831.patch, ZOOKEEPER-831.patch
>
>
> Reads and writes in BookKeeper are asymmetric: a write request writes one 
> entry, whereas a read request may read multiple entries. The current 
> implementation of throttling only counts the number of read requests instead 
> of counting the number of entries being read. Consequently, a few read 
> requests reading a large number of entries each will spawn a large number of 
> read-entry requests. 
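
A minimal sketch of throttling by entries rather than by requests, as the 
description above asks for. SketchReadThrottle and its method names are 
hypothetical and this is not the BookKeeper client code; it only shows the idea 
of charging one permit per entry in the requested range.

```java
import java.util.concurrent.Semaphore;

// Hypothetical throttle that budgets outstanding entries, not requests.
class SketchReadThrottle {
    private final Semaphore permits;

    SketchReadThrottle(int maxOutstandingEntries) {
        this.permits = new Semaphore(maxOutstandingEntries);
    }

    // A read of [firstEntry, lastEntry] costs one permit per entry.
    // Returns false without blocking when the budget does not cover it.
    boolean tryStartRead(long firstEntry, long lastEntry) {
        int entries = (int) (lastEntry - firstEntry + 1);
        return permits.tryAcquire(entries);
    }

    // Called once for each entry that has been read back.
    void entryCompleted() {
        permits.release();
    }
}
```

Counting requests instead would let a single request for thousands of entries 
through at the cost of one permit, which is the imbalance described above.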

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-869) Support for election of leader with arbitrary zxid

2010-09-17 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910643#action_12910643
 ] 

Benjamin Reed commented on ZOOKEEPER-869:
-

this is a good observation diogo, but i think you may be characterizing it 
improperly. the problem is that when we do a leadership change we increment the 
epoch and propose a new leader, so the zxids of all other processes will be 
much lower than the leader's. when a follower connects we figure out how far 
behind the follower is by comparing the lastProposed zxids and taking the 
difference. we should really be using the recent history to do the comparison.
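
A small sketch of why the raw difference is misleading. It relies on the zxid 
layout where the high 32 bits hold the epoch and the low 32 bits hold a 
counter; the SketchZxid helper and the example values are made up for 
illustration.

```java
// Hypothetical helper for packing and unpacking a zxid.
class SketchZxid {
    // High 32 bits: epoch. Low 32 bits: counter within that epoch.
    static long make(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    static long epoch(long zxid) {
        return zxid >>> 32;
    }

    static long counter(long zxid) {
        return zxid & 0xffffffffL;
    }
}
```

For example, a follower at SketchZxid.make(1, 100) and a leader that has just 
moved to SketchZxid.make(2, 0) differ by close to 2^32 as plain longs, even if 
the follower is missing few or no transactions.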

as a side note, if we were to choose not to take the maximum zxid during 
recovery, we need to make sure that we still cover all committed messages.

> Support for election of leader with arbitrary zxid
> --
>
> Key: ZOOKEEPER-869
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-869
> Project: Zookeeper
>  Issue Type: New Feature
>Reporter: Diogo
>Priority: Minor
>
> Currently, the leader election algorithm implemented guarantees that the 
> leader has the maximum zxid of the ensemble. The state synchronization after 
> the election was built based on this assumption. However, other leader 
> elections algorithms might elect leaders with arbitrary zxid. 
> To support other leader election algorithms, the state synchronization should 
> allow the leader to have an arbitrary zxid.




[jira] Updated: (ZOOKEEPER-846) zookeeper client doesn't shut down cleanly on the close call

2010-09-15 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-846:


Hadoop Flags: [Reviewed]

+1 looks good pat! it's nice that the checking and setting of closing is in the 
same routine. i agree about skipping the test case.
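
A minimal sketch of why keeping the check and the set together matters. 
SketchPacketQueue is hypothetical and is not the ClientCnxn code from the 
patch; it only illustrates doing both steps under one lock so that nothing can 
be queued behind a close request.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical outgoing queue that stops accepting work once a close is queued.
class SketchPacketQueue {
    private final Deque<String> outgoing = new ArrayDeque<>();
    private boolean closing;

    // The closing flag is checked and set under the same lock that guards the
    // queue, so no packet can slip in after the close packet.
    synchronized boolean queue(String packet, boolean isClose) {
        if (closing) {
            return false;          // rejected: a close is already queued
        }
        if (isClose) {
            closing = true;
        }
        outgoing.add(packet);
        return true;
    }

    synchronized int size() {
        return outgoing.size();
    }
}
```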

> zookeeper client doesn't shut down cleanly on the close call
> 
>
> Key: ZOOKEEPER-846
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-846
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.2.2
>Reporter: Ted Yu
>Assignee: Patrick Hunt
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: rs-13.stack, ZOOKEEPER-846.patch
>
>
> Using HBase 0.20.6 (with HBASE-2473) we encountered a situation where 
> Regionserver
> process was shutting down and seemed to hang.
> Here is the bottom of region server log:
> http://pastebin.com/YYawJ4jA
> zookeeper-3.2.2 is used.
> Here is relevant portion from jstack - I attempted to attach jstack twice in 
> my email to d...@hbase.apache.org but failed:
> "DestroyJavaVM" prio=10 tid=0x2aabb849c800 nid=0x6c60 waiting on 
> condition [0x]
>java.lang.Thread.State: RUNNABLE
> "regionserver/10.32.42.245:60020" prio=10 tid=0x2aabb84ce000 nid=0x6c81 
> in Object.wait() [0x43755000]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x2aaab76633c0> (a 
> org.apache.zookeeper.ClientCnxn$Packet)
> at java.lang.Object.wait(Object.java:485)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1099)
> - locked <0x2aaab76633c0> (a 
> org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.close(ClientCnxn.java:1077)
> at org.apache.zookeeper.ZooKeeper.close(ZooKeeper.java:505)
> - locked <0x2aaabf5e0c30> (a org.apache.zookeeper.ZooKeeper)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.close(ZooKeeperWrapper.java:681)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:654)
> at java.lang.Thread.run(Thread.java:619)
> "main-EventThread" daemon prio=10 tid=0x43474000 nid=0x6c80 waiting 
> on condition [0x413f3000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x2aaabf6e9150> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:414)




[jira] Updated: (ZOOKEEPER-785) Zookeeper 3.3.1 shouldn't infinite loop if someone creates a server.0 line

2010-09-10 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-785:


Status: Resolved  (was: Patch Available)
Resolution: Fixed

committed to trunk and 3.3 branch. thanx, you guys!

Committed revision 995845.
Committed revision 995844.


>  Zookeeper 3.3.1 shouldn't infinite loop if someone creates a server.0 line
> ---
>
> Key: ZOOKEEPER-785
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-785
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1
> Environment: Tested in linux with a new jvm
>Reporter: Alex Newman
>Assignee: Patrick Hunt
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-785.patch, ZOOKEEPER-785.patch
>
>
> The following config causes an infinite loop
> [zoo.cfg]
> tickTime=2000
> dataDir=/var/zookeeper/
> clientPort=2181
> initLimit=10
> syncLimit=5
> server.0=localhost:2888:3888
> Output:
> 2010-06-01 16:20:32,471 - INFO [main:quorumpeerm...@119] - Starting quorum 
> peer
> 2010-06-01 16:20:32,489 - INFO [main:nioservercnxn$fact...@143] - binding to 
> port 0.0.0.0/0.0.0.0:2181
> 2010-06-01 16:20:32,504 - INFO [main:quorump...@818] - tickTime set to 2000
> 2010-06-01 16:20:32,504 - INFO [main:quorump...@829] - minSessionTimeout set 
> to -1
> 2010-06-01 16:20:32,505 - INFO [main:quorump...@840] - maxSessionTimeout set 
> to -1
> 2010-06-01 16:20:32,505 - INFO [main:quorump...@855] - initLimit set to 10
> 2010-06-01 16:20:32,526 - INFO [main:files...@82] - Reading snapshot 
> /var/zookeeper/version-2/snapshot.c
> 2010-06-01 16:20:32,547 - INFO [Thread-1:quorumcnxmanager$liste...@436] - My 
> election bind port: 3888
> 2010-06-01 16:20:32,554 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@620] - LOOKING
> 2010-06-01 16:20:32,556 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@649] - New election. My 
> id = 0, Proposed zxid = 12
> 2010-06-01 16:20:32,558 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@689] - Notification: 0, 
> 12, 1, 0, LOOKING, LOOKING, 0
> 2010-06-01 16:20:32,560 - WARN 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@623] - Unexpected exception
> java.lang.NullPointerException
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.totalOrderPredicate(FastLeaderElection.java:496)
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:709)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:621)
> 2010-06-01 16:20:32,560 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@620] - LOOKING
> 2010-06-01 16:20:32,560 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@649] - New election. My 
> id = 0, Proposed zxid = 12
> 2010-06-01 16:20:32,561 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@689] - Notification: 0, 
> 12, 2, 0, LOOKING, LOOKING, 0
> 2010-06-01 16:20:32,561 - WARN 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@623] - Unexpected exception
> java.lang.NullPointerException
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.totalOrderPredicate(FastLeaderElection.java:496)
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:709)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:621)
> 2010-06-01 16:20:32,561 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@620] - LOOKING
> 2010-06-01 16:20:32,562 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@649] - New election. My 
> id = 0, Proposed zxid = 12
> 2010-06-01 16:20:32,562 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@689] - Notification: 0, 
> 12, 3, 0, LOOKING, LOOKING, 0
> 2010-06-01 16:20:32,562 - WARN 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@623] - Unexpected exception
> java.lang.NullPointerException
> Things like HBase require that the zookeeper servers be listed in the 
> zoo.cfg. This is a bug on their part, but zookeeper shouldn't null pointer in 
> a loop though.




[jira] Commented: (ZOOKEEPER-366) Session timeout detection can go wrong if the leader system time changes

2010-08-24 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901949#action_12901949
 ] 

Benjamin Reed commented on ZOOKEEPER-366:
-

holger you are correct. nanoTime is the way to go. i'll prepare a fix. one 
problem with it is that the fix will be impossible to test.
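
a minimal sketch of what a nanoTime-based deadline could look like, assuming a hypothetical helper class (SessionDeadline is not a ZooKeeper class, and this is not the attached patch). the one detail worth showing is that nanoTime values should be compared by subtraction, since the counter may wrap:

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: one session deadline tracked on the monotonic clock. */
final class SessionDeadline {
    private final long deadlineNanos;

    /** nowNanos is a System.nanoTime() reading; passing it in keeps the class testable. */
    SessionDeadline(long nowNanos, long timeoutMillis) {
        this.deadlineNanos = nowNanos + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    /** Convenience factory that reads the real monotonic clock. */
    static SessionDeadline startingNow(long timeoutMillis) {
        return new SessionDeadline(System.nanoTime(), timeoutMillis);
    }

    /** Compare by difference, not with >=, so the result stays correct if the counter overflows. */
    boolean isExpired(long nowNanos) {
        return nowNanos - deadlineNanos >= 0;
    }
}
```

because the readings are passed in as plain longs, the expiry logic itself can be exercised without touching the system clock, which may soften the "impossible to test" concern.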

> Session timeout detection can go wrong if the leader system time changes
> 
>
> Key: ZOOKEEPER-366
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-366
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Benjamin Reed
>Assignee: Benjamin Reed
> Attachments: ZOOKEEPER-366.patch
>
>
> the leader tracks session expirations by calculating when a session will 
> timeout and then periodically checking to see what needs to be timed out 
> based on the current time. this works great as long as the leaders clock 
> progresses at a steady pace. the problem comes when there are big (session 
> size) changes in clock, by ntp for example. if time gets adjusted forward, 
> all the sessions could timeout immediately. if time goes backward sessions 
> that should timeout may take a lot longer to actually expire.
> this is really just a leader issue. the easiest way to deal with this is to 
> have the leader relinquish leadership if it detects a big jump forward in 
> time. when a new leader gets elected, it will recalculate timeouts of active 
> sessions.




[jira] Commented: (ZOOKEEPER-366) Session timeout detection can go wrong if the leader system time changes

2010-08-20 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900824#action_12900824
 ] 

Benjamin Reed commented on ZOOKEEPER-366:
-

anyone have an idea of how to test this? i need to mock 
System.currentTimeMillis().
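
one common way around mocking a static method is to inject the time source. the sketch below assumes a hypothetical ExpiryChecker class (not part of ZooKeeper) that takes a LongSupplier, so production code can pass System::currentTimeMillis and a test can pass a value it controls:

```java
import java.util.function.LongSupplier;

/** Hypothetical class: expiry checks against an injectable millisecond clock. */
final class ExpiryChecker {
    private final LongSupplier clockMillis;

    ExpiryChecker(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    /** Deadline for a session touched right now. */
    long deadlineFor(long timeoutMillis) {
        return clockMillis.getAsLong() + timeoutMillis;
    }

    /** True once the injected clock has reached the deadline. */
    boolean isExpired(long deadlineMillis) {
        return clockMillis.getAsLong() >= deadlineMillis;
    }
}
```

in production this would be constructed as new ExpiryChecker(System::currentTimeMillis); a test can back the supplier with a variable and move it forward or backward to simulate a clock jump.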

> Session timeout detection can go wrong if the leader system time changes
> 
>
> Key: ZOOKEEPER-366
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-366
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Benjamin Reed
>Assignee: Benjamin Reed
> Attachments: ZOOKEEPER-366.patch
>
>
> the leader tracks session expirations by calculating when a session will 
> timeout and then periodically checking to see what needs to be timed out 
> based on the current time. this works great as long as the leaders clock 
> progresses at a steady pace. the problem comes when there are big (session 
> size) changes in clock, by ntp for example. if time gets adjusted forward, 
> all the sessions could timeout immediately. if time goes backward sessions 
> that should timeout may take a lot longer to actually expire.
> this is really just a leader issue. the easiest way to deal with this is to 
> have the leader relinquish leadership if it detects a big jump forward in 
> time. when a new leader gets elected, it will recalculate timeouts of active 
> sessions.




[jira] Updated: (ZOOKEEPER-366) Session timeout detection can go wrong if the leader system time changes

2010-08-20 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-366:


Attachment: ZOOKEEPER-366.patch

this patch smooths out the effect of a radical time change by always sleeping 
at least 1/2 tickTime. this means that if we really needed to do a big jump 
forward, it will take up 1/2 of the jump to converge on the real time. because 
clients ping for idle times of 1/3 the timeout, there should be few sessions 
that expire. we could reduce that number, but take even longer to converge if 
we always sleep at least 3/4 of the tickTime.
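
the "always sleep at least 1/2 tickTime" rule can be read as a clamp on the wait before the next bucket. this is a sketch under that assumption only; ExpirySleep and its parameters are hypothetical names, not code from the attached patch:

```java
/** Hypothetical helper: how long the expiration thread waits before the next bucket. */
final class ExpirySleep {
    /**
     * Normally waits until the next bucket is due. After a forward clock jump the
     * bucket is already overdue (negative wait), so the floor of tickTime / 2 applies.
     */
    static long sleepMillis(long nextExpirationTime, long now, long tickTime) {
        long untilDue = nextExpirationTime - now;
        return Math.max(untilDue, tickTime / 2);
    }
}
```

raising the floor to 3/4 of tickTime, as mentioned above, would only change the second argument of Math.max.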

> Session timeout detection can go wrong if the leader system time changes
> 
>
> Key: ZOOKEEPER-366
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-366
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Benjamin Reed
>Assignee: Benjamin Reed
> Attachments: ZOOKEEPER-366.patch
>
>
> the leader tracks session expirations by calculating when a session will 
> timeout and then periodically checking to see what needs to be timed out 
> based on the current time. this works great as long as the leaders clock 
> progresses at a steady pace. the problem comes when there are big (session 
> size) changes in clock, by ntp for example. if time gets adjusted forward, 
> all the sessions could timeout immediately. if time goes backward sessions 
> that should timeout may take a lot longer to actually expire.
> this is really just a leader issue. the easiest way to deal with this is to 
> have the leader relinquish leadership if it detects a big jump forward in 
> time. when a new leader gets elected, it will recalculate timeouts of active 
> sessions.




[jira] Commented: (ZOOKEEPER-366) Session timeout detection can go wrong if the leader system time changes

2010-08-19 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900511#action_12900511
 ] 

Benjamin Reed commented on ZOOKEEPER-366:
-

after discussing this on the list, we realized that we can detect a big jump 
forward in time in the session expiration thread. since we expire a bucket of 
sessions each tick, if we run into the situation where we are going to expire 
more than one bucket in a row, we know we have jumped forward in time. we can 
"smooth" the jump by requiring at least a 1/2 tickTime wait between each 
bucket.
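
the detection step can be sketched as counting how many buckets are already overdue. the class and method names below are hypothetical, and the bucket layout (one bucket per tickTime, starting at nextExpirationTime) is an assumption taken from the description above:

```java
/** Hypothetical helper: detects a forward clock jump from overdue expiration buckets. */
final class JumpDetector {
    /** Number of buckets whose expiration time is at or before now. */
    static long overdueBuckets(long nextExpirationTime, long now, long tickTime) {
        if (now < nextExpirationTime) {
            return 0;
        }
        return (now - nextExpirationTime) / tickTime + 1;
    }

    /** More than one overdue bucket means we would expire several buckets in a row. */
    static boolean jumpedForward(long nextExpirationTime, long now, long tickTime) {
        return overdueBuckets(nextExpirationTime, now, tickTime) > 1;
    }
}
```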

> Session timeout detection can go wrong if the leader system time changes
> 
>
> Key: ZOOKEEPER-366
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-366
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Benjamin Reed
>
> the leader tracks session expirations by calculating when a session will 
> timeout and then periodically checking to see what needs to be timed out 
> based on the current time. this works great as long as the leaders clock 
> progresses at a steady pace. the problem comes when there are big (session 
> size) changes in clock, by ntp for example. if time gets adjusted forward, 
> all the sessions could timeout immediately. if time goes backward sessions 
> that should timeout may take a lot longer to actually expire.
> this is really just a leader issue. the easiest way to deal with this is to 
> have the leader relinquish leadership if it detects a big jump forward in 
> time. when a new leader gets elected, it will recalculate timeouts of active 
> sessions.




[jira] Assigned: (ZOOKEEPER-366) Session timeout detection can go wrong if the leader system time changes

2010-08-19 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed reassigned ZOOKEEPER-366:
---

Assignee: Benjamin Reed

> Session timeout detection can go wrong if the leader system time changes
> 
>
> Key: ZOOKEEPER-366
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-366
> Project: Zookeeper
>  Issue Type: Bug
>Reporter: Benjamin Reed
>Assignee: Benjamin Reed
>
> the leader tracks session expirations by calculating when a session will 
> timeout and then periodically checking to see what needs to be timed out 
> based on the current time. this works great as long as the leaders clock 
> progresses at a steady pace. the problem comes when there are big (session 
> size) changes in clock, by ntp for example. if time gets adjusted forward, 
> all the sessions could timeout immediately. if time goes backward sessions 
> that should timeout may take a lot longer to actually expire.
> this is really just a leader issue. the easiest way to deal with this is to 
> have the leader relinquish leadership if it detects a big jump forward in 
> time. when a new leader gets elected, it will recalculate timeouts of active 
> sessions.




[jira] Updated: (ZOOKEEPER-795) eventThread isn't shutdown after a connection "session expired" event coming

2010-08-17 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-795:


Status: Resolved  (was: Patch Available)
Resolution: Fixed

Committed revision 986470 in branch 3.3.


> eventThread isn't shutdown after a connection "session expired" event coming
> 
>
> Key: ZOOKEEPER-795
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-795
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.3.1
> Environment: ubuntu 10.04
>Reporter: mathieu barcikowski
>Assignee: Sergey Doroshenko
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ExpiredSessionThreadLeak.java, ZOOKEEPER-795.patch, 
> ZOOKEEPER-795.patch
>
>
> Hi,
> I noticed a problem with the eventThread located in the ClientCnxn.java file.
> The eventThread isn't shut down after a connection "session expired" event 
> arrives (i.e. it never receives the EventOfDeath).
> When a session timeout occurs and the session is marked as expired, the 
> connection is fully closed (socket, SendThread...) except for the eventThread.
> As a result, if I create a new zookeeper object and connect through it, I get 
> a zombie thread which will never be killed (as for the previous zookeeper 
> object, the state is already closed, so calling close again doesn't do anything).
> So every time I create a new zookeeper connection after an expired 
> session, I will have one more zombie EventThread.
> How to reproduce:
> - Start a zookeeper client connection in debug mode
> - Pause the JVM long enough for the expired event to occur
> - Watch the list of threads, for example with jvisualvm: the sendThread is 
> successfully killed, but the EventThread goes into a wait state 
> indefinitely
> - If you reopen a new zookeeper connection and repeat the previous steps, 
> another EventThread will be present in an infinite wait state




[jira] Commented: (ZOOKEEPER-733) use netty to handle client connections

2010-08-16 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899101#action_12899101
 ] 

Benjamin Reed commented on ZOOKEEPER-733:
-

we should commit the patch as is. trying to add features to it while keeping 
the patch fresh is too unwieldy!

> use netty to handle client connections
> --
>
> Key: ZOOKEEPER-733
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-733
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Benjamin Reed
>Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: accessive.jar, flowctl.zip, moved.zip, 
> QuorumTestFailed_sessionmoved_TRACE_LOG.txt.gz, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch
>
>
> we currently have our own asynchronous NIO socket engine to be able to handle 
> lots of clients with a single thread. over time the engine has become more 
> complicated. we would also like the engine to use multiple threads on 
> machines with lots of cores. plus, we would like to be able to support things 
> like SSL. if we switch to netty, we can simplify our code and get the 
> previously mentioned benefits.




[jira] Commented: (ZOOKEEPER-845) remove duplicate code from netty and nio ServerCnxn classes

2010-08-12 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897880#action_12897880
 ] 

Benjamin Reed commented on ZOOKEEPER-845:
-

perhaps we could extract the actual processing logic from the threading model.
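
one possible shape for that split, sketched with hypothetical names (FourLetterCommands is not an existing ZooKeeper class): the command table lives in one transport-agnostic class, and both the nio and netty ServerCnxn variants would only read the word off the wire and write back whatever string comes out. only the ruok/imok pair is taken from real server behavior; anything else registered would be up to the caller.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical transport-agnostic table of four-letter-word commands. */
final class FourLetterCommands {
    private final Map<String, Supplier<String>> commands;

    FourLetterCommands(Map<String, Supplier<String>> commands) {
        // Defensive copy so later changes to the caller's map do not leak in.
        this.commands = new HashMap<>(commands);
    }

    /** Minimal default table: "ruok" answers "imok". */
    static FourLetterCommands defaults() {
        Map<String, Supplier<String>> table = new HashMap<>();
        table.put("ruok", () -> "imok");
        return new FourLetterCommands(table);
    }

    /** Returns the response text, or empty if the word is not a known command. */
    Optional<String> handle(String word) {
        Supplier<String> command = commands.get(word);
        return command == null ? Optional.empty() : Optional.of(command.get());
    }
}
```

with this layout the threading model only decides which thread calls handle(), and the command logic exists once.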

> remove duplicate code from netty and nio ServerCnxn classes
> ---
>
> Key: ZOOKEEPER-845
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-845
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Benjamin Reed
> Fix For: 3.4.0
>
>
> the code for handling the 4-letter words is duplicated between the nio and 
> netty versions of ServerCnxn. this makes maintenance problematic. 




[jira] Created: (ZOOKEEPER-845) remove duplicate code from netty and nio ServerCnxn classes

2010-08-12 Thread Benjamin Reed (JIRA)
remove duplicate code from netty and nio ServerCnxn classes
---

 Key: ZOOKEEPER-845
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-845
 Project: Zookeeper
  Issue Type: Improvement
  Components: server
Reporter: Benjamin Reed


the code for handling the 4-letter words is duplicated between the nio and 
netty versions of ServerCnxn. this makes maintenance problematic. 




[jira] Updated: (ZOOKEEPER-733) use netty to handle client connections

2010-08-12 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-733:


Hadoop Flags: [Reviewed]

+1 looks good to commit. Sergey raises a valid point, but i think it should be 
addressed in a separate jira given the size of this patch.

> use netty to handle client connections
> --
>
> Key: ZOOKEEPER-733
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-733
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Benjamin Reed
>Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: accessive.jar, flowctl.zip, moved.zip, 
> QuorumTestFailed_sessionmoved_TRACE_LOG.txt.gz, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch
>
>
> we currently have our own asynchronous NIO socket engine to be able to handle 
> lots of clients with a single thread. over time the engine has become more 
> complicated. we would also like the engine to use multiple threads on 
> machines with lots of cores. plus, we would like to be able to support things 
> like SSL. if we switch to netty, we can simplify our code and get the 
> previously mentioned benefits.




[jira] Commented: (ZOOKEEPER-775) A large scale pub/sub system

2010-08-11 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897581#action_12897581
 ] 

Benjamin Reed commented on ZOOKEEPER-775:
-

i believe the NOTICE file is consistent with: 
http://apache.org/legal/src-headers.html#header-existingcopyright

> A large scale pub/sub system
> 
>
> Key: ZOOKEEPER-775
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-775
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: contrib
>Reporter: Benjamin Reed
>Assignee: Benjamin Reed
> Fix For: 3.4.0
>
> Attachments: libs.zip, libs_2.zip, ZOOKEEPER-775.patch, 
> ZOOKEEPER-775.patch, ZOOKEEPER-775.patch, ZOOKEEPER-775_2.patch, 
> ZOOKEEPER-775_3.patch
>
>
> we have developed a large scale pub/sub system based on ZooKeeper and 
> BookKeeper.




[jira] Commented: (ZOOKEEPER-794) Callbacks are not invoked when the client is closed

2010-08-10 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896948#action_12896948
 ] 

Benjamin Reed commented on ZOOKEEPER-794:
-

alexis, i'm missing the problem you are pointing out. is it an issue with the 
ordering of the callbacks?

i'm also wondering about your _3 patch. it is much smaller than the others. is 
it to be applied to trunk, or is it relative to a different patch?

> Callbacks are not invoked when the client is closed
> ---
>
> Key: ZOOKEEPER-794
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-794
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.3.1
>Reporter: Alexis Midon
>Assignee: Alexis Midon
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-794.patch.txt, ZOOKEEPER-794.txt, 
> ZOOKEEPER-794_2.patch, ZOOKEEPER-794_3.patch
>
>
> I noticed that ZooKeeper has different behaviors when calling synchronous or 
> asynchronous actions on a closed ZooKeeper client.
> Actually a synchronous call will throw a "session expired" exception while an 
> asynchronous call will do nothing. No exception, no callback invocation.
> Actually, even if the EventThread receives the Packet with the session 
> expired err code, the packet is never processed since the thread has been 
> killed by the EventOfDeath. So the callback is not invoked.




[jira] Updated: (ZOOKEEPER-338) zk hosts should be resolved periodically for loadbalancing amongst zk servers.

2010-08-10 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-338:


Component/s: java client
 (was: c client)

it is an issue for both the c and java clients.
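
for the java side, a minimal sketch of the idea: keep the unresolved host string and resolve it again whenever a fresh server list is needed, instead of resolving once at init. HostListResolver is a hypothetical name, the parsing assumes plain host:port entries (no IPv6 literals), and the JVM may still cache lookups depending on its networkaddress.cache.ttl setting.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: keeps the raw host list and re-resolves it on demand. */
final class HostListResolver {
    private final String hostList; // e.g. "zk1:2181,zk2:2181"

    HostListResolver(String hostList) {
        this.hostList = hostList;
    }

    /** Resolves every host again; call periodically or before each reconnect. */
    List<InetSocketAddress> resolveNow() {
        List<InetSocketAddress> resolved = new ArrayList<>();
        for (String entry : hostList.split(",")) {
            String hostPort = entry.trim();
            int colon = hostPort.lastIndexOf(':');
            String host = hostPort.substring(0, colon);
            int port = Integer.parseInt(hostPort.substring(colon + 1));
            try {
                // One name may map to several addresses; keep all of them.
                for (InetAddress address : InetAddress.getAllByName(host)) {
                    resolved.add(new InetSocketAddress(address, port));
                }
            } catch (UnknownHostException e) {
                throw new UncheckedIOException(e);
            }
        }
        return resolved;
    }
}
```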

> zk hosts should be resolved periodically for loadbalancing amongst zk servers.
> --
>
> Key: ZOOKEEPER-338
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-338
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: c client, java client
>Affects Versions: 3.0.0, 3.0.1, 3.1.0
>Reporter: Mahadev konar
>
> The list of host names passed to the ZK init method is resolved only once. If a 
> corresponding DNS entry changes, it is not refreshed by the ZK library, 
> effectively preventing proper load balancing.




[jira] Updated: (ZOOKEEPER-338) zk hosts should be resolved periodically for loadbalancing amongst zk servers.

2010-08-10 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-338:


Component/s: c client

> zk hosts should be resolved periodically for loadbalancing amongst zk servers.
> --
>
> Key: ZOOKEEPER-338
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-338
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: c client, java client
>Affects Versions: 3.0.0, 3.0.1, 3.1.0
>Reporter: Mahadev konar
>
> The list of host names passed to the ZK init method is resolved only once. If a 
> corresponding DNS entry changes, it is not refreshed by the ZK library, 
> effectively preventing proper load balancing.




[jira] Commented: (ZOOKEEPER-733) use netty to handle client connections

2010-08-09 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896573#action_12896573
 ] 

Benjamin Reed commented on ZOOKEEPER-733:
-

+1 if you would re-generate the patch, i'd like to get it committed.

> use netty to handle client connections
> --
>
> Key: ZOOKEEPER-733
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-733
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Benjamin Reed
>Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: accessive.jar, flowctl.zip, moved.zip, 
> QuorumTestFailed_sessionmoved_TRACE_LOG.txt.gz, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch
>
>
> we currently have our own asynchronous NIO socket engine to be able to handle 
> lots of clients with a single thread. over time the engine has become more 
> complicated. we would also like the engine to use multiple threads on 
> machines with lots of cores. plus, we would like to be able to support things 
> like SSL. if we switch to netty, we can simplify our code and get the 
> previously mentioned benefits.




[jira] Commented: (ZOOKEEPER-829) Add /zookeeper/sessions/* to allow inspection/manipulation of client sessions

2010-07-29 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893910#action_12893910
 ] 

Benjamin Reed commented on ZOOKEEPER-829:
-

should we kill the session immediately or wait until the sessionTimeout? 
killing it immediately seems like it is violating a contract.

> Add /zookeeper/sessions/* to allow inspection/manipulation of client sessions
> -
>
> Key: ZOOKEEPER-829
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-829
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: server
>Reporter: Todd Lipcon
>
> For some use cases in HBase (HBASE-1316 in particular) we'd like the ability 
> to forcibly expire someone else's ZK session. Patrick and I discussed on IRC 
> and came up with an idea of creating nodes in /zookeeper/sessions/<session id> 
> that can be read in order to get basic stats about a session, and written 
> in order to manipulate one. The manipulation we need in HBase is the ability 
> to write a command like "kill", but others might be useful as well.




[jira] Updated: (ZOOKEEPER-795) eventThread isn't shutdown after a connection "session expired" event coming

2010-07-28 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-795:


Attachment: ZOOKEEPER-795.patch

i've added a test. (added to the existing session expiration test, so it 
shouldn't add any running time to the tests)

> eventThread isn't shutdown after a connection "session expired" event coming
> 
>
> Key: ZOOKEEPER-795
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-795
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.3.1
> Environment: ubuntu 10.04
>Reporter: mathieu barcikowski
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ExpiredSessionThreadLeak.java, ZOOKEEPER-795.patch, 
> ZOOKEEPER-795.patch
>
>
> Hi,
> I noticed a problem with the eventThread located in the ClientCnxn.java file.
> The eventThread isn't shut down after a connection "session expired" event 
> arrives (i.e. it never receives the EventOfDeath).
> When a session timeout occurs and the session is marked as expired, the 
> connection is fully closed (socket, SendThread...) except for the eventThread.
> As a result, if I create a new zookeeper object and connect through it, I get 
> a zombie thread which will never be killed (as for the previous zookeeper 
> object, the state is already closed, so calling close again doesn't do anything).
> So every time I create a new zookeeper connection after an expired 
> session, I will have one more zombie EventThread.
> How to reproduce:
> - Start a zookeeper client connection in debug mode
> - Pause the JVM long enough for the expired event to occur
> - Watch the list of threads, for example with jvisualvm: the sendThread is 
> successfully killed, but the EventThread goes into a wait state 
> indefinitely
> - If you reopen a new zookeeper connection and repeat the previous steps, 
> another EventThread will be present in an infinite wait state




[jira] Updated: (ZOOKEEPER-795) eventThread isn't shutdown after a connection "session expired" event coming

2010-07-28 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-795:


Status: Patch Available  (was: Open)

> eventThread isn't shutdown after a connection "session expired" event coming
> 
>
> Key: ZOOKEEPER-795
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-795
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.3.1
> Environment: ubuntu 10.04
>Reporter: mathieu barcikowski
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ExpiredSessionThreadLeak.java, ZOOKEEPER-795.patch, 
> ZOOKEEPER-795.patch
>
>
> Hi,
> I noticed a problem with the eventThread located in the ClientCnxn.java file.
> The eventThread isn't shut down after a connection "session expired" event 
> arrives (i.e. it never receives the EventOfDeath).
> When a session timeout occurs and the session is marked as expired, the 
> connection is fully closed (socket, SendThread...) except for the eventThread.
> As a result, if I create a new zookeeper object and connect through it, I get 
> a zombie thread which will never be killed (as for the previous zookeeper 
> object, the state is already closed, so calling close again doesn't do anything).
> So every time I create a new zookeeper connection after an expired 
> session, I will have one more zombie EventThread.
> How to reproduce:
> - Start a zookeeper client connection in debug mode
> - Pause the JVM long enough for the expired event to occur
> - Watch the list of threads, for example with jvisualvm: the sendThread is 
> successfully killed, but the EventThread goes into a wait state 
> indefinitely
> - If you reopen a new zookeeper connection and repeat the previous steps, 
> another EventThread will be present in an infinite wait state




[jira] Commented: (ZOOKEEPER-733) use netty to handle client connections

2010-07-27 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893025#action_12893025
 ] 

Benjamin Reed commented on ZOOKEEPER-733:
-

i ran this on 40 machines simulating 900 clients. the benchmark went well 
without problems. the results don't show any real significant performance 
improvements (or degradations).

> use netty to handle client connections
> --
>
> Key: ZOOKEEPER-733
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-733
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Benjamin Reed
>Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: accessive.jar, flowctl.zip, moved.zip, 
> QuorumTestFailed_sessionmoved_TRACE_LOG.txt.gz, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch
>
>
> we currently have our own asynchronous NIO socket engine to be able to handle 
> lots of clients with a single thread. over time the engine has become more 
> complicated. we would also like the engine to use multiple threads on 
> machines with lots of cores. plus, we would like to be able to support things 
> like SSL. if we switch to netty, we can simplify our code and get the 
> previously mentioned benefits.




[jira] Updated: (ZOOKEEPER-733) use netty to handle client connections

2010-07-27 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-733:


Status: Patch Available  (was: Open)

> use netty to handle client connections
> --
>
> Key: ZOOKEEPER-733
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-733
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Benjamin Reed
>Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: accessive.jar, flowctl.zip, moved.zip, 
> QuorumTestFailed_sessionmoved_TRACE_LOG.txt.gz, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch
>
>
> we currently have our own asynchronous NIO socket engine to be able to handle 
> lots of clients with a single thread. over time the engine has become more 
> complicated. we would also like the engine to use multiple threads on 
> machines with lots of cores. plus, we would like to be able to support things 
> like SSL. if we switch to netty, we can simplify our code and get the 
> previously mentioned benefits.




[jira] Updated: (ZOOKEEPER-733) use netty to handle client connections

2010-07-27 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-733:


Status: Open  (was: Patch Available)

> use netty to handle client connections
> --
>
> Key: ZOOKEEPER-733
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-733
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: server
>Reporter: Benjamin Reed
>Assignee: Patrick Hunt
> Fix For: 3.4.0
>
> Attachments: accessive.jar, flowctl.zip, moved.zip, 
> QuorumTestFailed_sessionmoved_TRACE_LOG.txt.gz, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch
>
>
> we currently have our own asynchronous NIO socket engine to be able to handle 
> lots of clients with a single thread. over time the engine has become more 
> complicated. we would also like the engine to use multiple threads on 
> machines with lots of cores. plus, we would like to be able to support things 
> like SSL. if we switch to netty, we can simplify our code and get the 
> previously mentioned benefits.




[jira] Commented: (ZOOKEEPER-775) A large scale pub/sub system

2010-07-27 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892860#action_12892860
 ] 

Benjamin Reed commented on ZOOKEEPER-775:
-

can we do the forrest doc as a separate patch? it's already quite large as it 
is.

> A large scale pub/sub system
> 
>
> Key: ZOOKEEPER-775
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-775
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: contrib
>Reporter: Benjamin Reed
>Assignee: Benjamin Reed
> Fix For: 3.4.0
>
> Attachments: libs.zip, libs_2.zip, ZOOKEEPER-775.patch, 
> ZOOKEEPER-775.patch, ZOOKEEPER-775_2.patch, ZOOKEEPER-775_3.patch
>
>
> we have developed a large scale pub/sub system based on ZooKeeper and 
> BookKeeper.




[jira] Updated: (ZOOKEEPER-790) Last processed zxid set prematurely while establishing leadership

2010-07-27 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-790:



+1 excellent work you guys. i also like QuorumUtil sergey! thanx for 
implementing it.

> Last processed zxid set prematurely while establishing leadership
> -
>
> Key: ZOOKEEPER-790
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-790
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.1
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-790-3.3.patch, ZOOKEEPER-790-3.3.patch, 
> ZOOKEEPER-790-follower-request-NPE.log, ZOOKEEPER-790-test.patch, 
> ZOOKEEPER-790.patch, ZOOKEEPER-790.patch, ZOOKEEPER-790.patch, 
> ZOOKEEPER-790.patch, ZOOKEEPER-790.patch, ZOOKEEPER-790.travis.log.bz2, 
> ZOOKEEPER-790.v2.patch, ZOOKEEPER-790.v2.patch
>
>
> The leader code is setting the last processed zxid to the first of the new 
> epoch even before connecting to a quorum of followers. Because the leader 
> code sets this value before connecting to a quorum of followers 
> (Leader.java:281) and the follower code throws an IOException 
> (Follower.java:73) if the leader epoch is smaller, we have that when the 
> false leader drops leadership and becomes a follower, it finds a smaller 
> epoch and kills itself.
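
for readers less familiar with zxids, here is a minimal sketch of the epoch comparison described above. it assumes the usual layout in which the high 32 bits of the 64-bit zxid carry the epoch and the low 32 bits a per-epoch counter. the class and method names are hypothetical and not taken from Leader.java or Follower.java.

```java
import java.io.IOException;

// illustrative only: names are hypothetical; the layout (epoch in the high
// 32 bits, counter in the low 32 bits) is an assumption stated in the lead-in.
final class ZxidSketch {
    private ZxidSketch() {}

    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    // first zxid of the epoch after the one encoded in lastZxid
    static long firstZxidOfNextEpoch(long lastZxid) {
        return makeZxid(epochOf(lastZxid) + 1, 0);
    }

    // follower-side check as described in the report: refuse a leader whose
    // epoch is smaller than the epoch this server has already recorded.
    static void checkLeaderEpoch(long leaderZxid, long lastSeenZxid) throws IOException {
        if (epochOf(leaderZxid) < epochOf(lastSeenZxid)) {
            throw new IOException("leader epoch " + epochOf(leaderZxid)
                    + " is smaller than recorded epoch " + epochOf(lastSeenZxid));
        }
    }
}
```

in this sketch a false leader that already recorded firstZxidOfNextEpoch(...) and then tries to follow a leader still in the old epoch fails checkLeaderEpoch, which matches the failure described in the report.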




[jira] Commented: (ZOOKEEPER-702) GSoC 2010: Failure Detector Model

2010-07-26 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892649#action_12892649
 ] 

Benjamin Reed commented on ZOOKEEPER-702:
-

hey abmar this looks pretty cool. i think it will make the code much 
clearer when we get this in.

i'm wondering about the asymmetrical nature of the current zk fd that you've 
already mentioned: one side is generating traffic and the other is responding. 
shouldn't that be reflected in the interface? it is a bit confusing when parts 
of the interface are used on one side and parts on the other.

on a separate note, i'm wondering if our use of TCP affects the failure 
detector. since TCP automatically handles lost packets, we aren't going to lose 
packets like we do with UDP based protocols.
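
to make the asymmetry point concrete, here is a rough sketch of what a split interface could look like, with a plain fixed-timeout detector behind each side. every name here is hypothetical and none of it comes from the attached patches; timestamps are passed in as milliseconds so the sketch stays deterministic.

```java
// hypothetical split of the failure detector interface by role.
// the side that generates traffic only ever sees responses...
interface PingingSide {
    void onResponse(long nowMs);
    boolean peerSuspected(long nowMs);
}

// ...and the side that answers only ever sees pings.
interface RespondingSide {
    void onPing(long nowMs);
    boolean peerSuspected(long nowMs);
}

// fixed-timeout detector for the side that sends pings
final class TimeoutPinger implements PingingSide {
    private final long timeoutMs;
    private long lastResponseMs;

    TimeoutPinger(long timeoutMs, long startMs) {
        this.timeoutMs = timeoutMs;
        this.lastResponseMs = startMs;
    }

    @Override
    public void onResponse(long nowMs) {
        lastResponseMs = nowMs;
    }

    @Override
    public boolean peerSuspected(long nowMs) {
        return nowMs - lastResponseMs > timeoutMs;
    }
}

// fixed-timeout detector for the side that answers pings
final class TimeoutResponder implements RespondingSide {
    private final long timeoutMs;
    private long lastPingMs;

    TimeoutResponder(long timeoutMs, long startMs) {
        this.timeoutMs = timeoutMs;
        this.lastPingMs = startMs;
    }

    @Override
    public void onPing(long nowMs) {
        lastPingMs = nowMs;
    }

    @Override
    public boolean peerSuspected(long nowMs) {
        return nowMs - lastPingMs > timeoutMs;
    }
}
```

with this shape neither side has to carry methods it never calls, which is the confusion raised above.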

> GSoC 2010: Failure Detector Model
> -
>
> Key: ZOOKEEPER-702
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-702
> Project: Zookeeper
>  Issue Type: Wish
>Reporter: Henry Robinson
>Assignee: Abmar Barros
> Attachments: bertier-pseudo.txt, bertier-pseudo.txt, chen-pseudo.txt, 
> chen-pseudo.txt, phiaccrual-pseudo.txt, phiaccrual-pseudo.txt, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, 
> ZOOKEEPER-702.patch, ZOOKEEPER-702.patch
>
>
> Failure Detector Module
> Possible Mentor
> Henry Robinson (henry at apache dot org)
> Requirements
> Java, some distributed systems knowledge, comfort implementing distributed 
> systems protocols
> Description
> ZooKeeper servers detect the failure of other servers and clients by 
> counting the number of 'ticks' for which they don't get a heartbeat from 
> other machines. This is the 'timeout' method of failure detection and works 
> very well; however it is possible that it is too aggressive and not easily 
> tuned for some more unusual ZooKeeper installations (such as in a wide-area 
> network, or even in a mobile ad-hoc network).
> This project would abstract the notion of failure detection to a dedicated 
> Java module, and implement several failure detectors to compare and contrast 
> their appropriateness for ZooKeeper. For example, Apache Cassandra uses a 
> phi-accrual failure detector (http://ddsg.jaist.ac.jp/pub/HDY+04.pdf) which 
> is much more tunable and has some very interesting properties. This is a 
> great project if you are interested in distributed algorithms, or want to 
> help re-factor some of ZooKeeper's internal code.




[jira] Updated: (ZOOKEEPER-790) Last processed zxid set prematurely while establishing leadership

2010-07-22 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-790:


Status: Resolved  (was: Patch Available)
Resolution: Fixed

Committed revision 966960.
Committed revision 966984.

> Last processed zxid set prematurely while establishing leadership
> -
>
> Key: ZOOKEEPER-790
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-790
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.1
>Reporter: Flavio Paiva Junqueira
>Assignee: Flavio Paiva Junqueira
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-790-3.3.patch, ZOOKEEPER-790-3.3.patch, 
> ZOOKEEPER-790.patch, ZOOKEEPER-790.patch, ZOOKEEPER-790.patch, 
> ZOOKEEPER-790.patch, ZOOKEEPER-790.patch, ZOOKEEPER-790.travis.log.bz2
>
>
> The leader code is setting the last processed zxid to the first of the new 
> epoch even before connecting to a quorum of followers. Because the leader 
> code sets this value before connecting to a quorum of followers 
> (Leader.java:281) and the follower code throws an IOException 
> (Follower.java:73) if the leader epoch is smaller, we have that when the 
> false leader drops leadership and becomes a follower, it finds a smaller 
> epoch and kills itself.




[jira] Updated: (ZOOKEEPER-790) Last processed zxid set prematurely while establishing leadership

2010-07-22 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-790:


Hadoop Flags: [Reviewed]

+1 great job flavio! thanx for your help travis and vishal.

> Last processed zxid set prematurely while establishing leadership
> -
>
> Key: ZOOKEEPER-790
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-790
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.1
>Reporter: Flavio Paiva Junqueira
>Assignee: Flavio Paiva Junqueira
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-790-3.3.patch, ZOOKEEPER-790-3.3.patch, 
> ZOOKEEPER-790.patch, ZOOKEEPER-790.patch, ZOOKEEPER-790.patch, 
> ZOOKEEPER-790.patch, ZOOKEEPER-790.patch, ZOOKEEPER-790.travis.log.bz2
>
>
> The leader code is setting the last processed zxid to the first of the new 
> epoch even before connecting to a quorum of followers. Because the leader 
> code sets this value before connecting to a quorum of followers 
> (Leader.java:281) and the follower code throws an IOException 
> (Follower.java:73) if the leader epoch is smaller, we have that when the 
> false leader drops leadership and becomes a follower, it finds a smaller 
> epoch and kills itself.




[jira] Commented: (ZOOKEEPER-790) Last processed zxid set prematurely while establishing leadership

2010-07-22 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891210#action_12891210
 ] 

Benjamin Reed commented on ZOOKEEPER-790:
-

looks great flavio! the only nit i have is that the test case assumes that s1 
is not the leader. you might want to check that.

> Last processed zxid set prematurely while establishing leadership
> -
>
> Key: ZOOKEEPER-790
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-790
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.3.1
>Reporter: Flavio Paiva Junqueira
>Assignee: Flavio Paiva Junqueira
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-790-3.3.patch, ZOOKEEPER-790.patch, 
> ZOOKEEPER-790.patch, ZOOKEEPER-790.patch, ZOOKEEPER-790.patch, 
> ZOOKEEPER-790.travis.log.bz2
>
>
> The leader code is setting the last processed zxid to the first of the new 
> epoch even before connecting to a quorum of followers. Because the leader 
> code sets this value before connecting to a quorum of followers 
> (Leader.java:281) and the follower code throws an IOException 
> (Follower.java:73) if the leader epoch is smaller, we have that when the 
> false leader drops leadership and becomes a follower, it finds a smaller 
> epoch and kills itself.




[jira] Updated: (ZOOKEEPER-794) Callbacks are not invoked when the client is closed

2010-07-09 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-794:


Attachment: ZOOKEEPER-794_2.patch

i've added a test case and i think i've addressed the race condition. alexis, 
can you check it out? the only change to your code was to make waskilled 
volatile and move where it was set.

> Callbacks are not invoked when the client is closed
> ---
>
> Key: ZOOKEEPER-794
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-794
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.3.1
>Reporter: Alexis Midon
>Assignee: Alexis Midon
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-794.patch.txt, ZOOKEEPER-794.txt, 
> ZOOKEEPER-794_2.patch
>
>
> I noticed that ZooKeeper has different behaviors when calling synchronous or 
> asynchronous actions on a closed ZooKeeper client.
> Actually a synchronous call will throw a "session expired" exception while an 
> asynchronous call will do nothing. No exception, no callback invocation.
> Actually, even if the EventThread receives the Packet with the session 
> expired err code, the packet is never processed since the thread has been 
> killed by the eventOfDeath. So the callback is not invoked.




[jira] Updated: (ZOOKEEPER-794) Callbacks are not invoked when the client is closed

2010-07-09 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-794:


Status: Patch Available  (was: Open)

> Callbacks are not invoked when the client is closed
> ---
>
> Key: ZOOKEEPER-794
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-794
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.3.1
>Reporter: Alexis Midon
>Assignee: Alexis Midon
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-794.patch.txt, ZOOKEEPER-794.txt, 
> ZOOKEEPER-794_2.patch
>
>
> I noticed that ZooKeeper has different behaviors when calling synchronous or 
> asynchronous actions on a closed ZooKeeper client.
> Actually a synchronous call will throw a "session expired" exception while an 
> asynchronous call will do nothing. No exception, no callback invocation.
> Actually, even if the EventThread receives the Packet with the session 
> expired err code, the packet is never processed since the thread has been 
> killed by the eventOfDeath. So the callback is not invoked.




[jira] Updated: (ZOOKEEPER-794) Callbacks are not invoked when the client is closed

2010-07-09 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-794:


Status: Open  (was: Patch Available)

-1 we need to get a test in. also the fix has a race condition. the boolean 
flag may change after it is checked and before the request is queued.
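
to illustrate the kind of check-then-act race i mean, here is a minimal sketch. it is not the patch's code: the class, field and method names are made up. the point is only that the flag check and the enqueue have to happen under the same lock as the code that sets the flag.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// hypothetical stand-in for a client-side request queue.
final class RequestQueueSketch {
    private final Object lock = new Object();
    private final Queue<String> outgoing = new ArrayDeque<>();
    private boolean closed;

    // racy shape (shown as a comment only):
    //   if (!closed) {            // check
    //       outgoing.add(request); // act: close() may run between the two
    //   }

    // safe shape: the check and the enqueue are one atomic step.
    boolean submit(String request) {
        synchronized (lock) {
            if (closed) {
                return false; // caller must fail the request and invoke its callback
            }
            outgoing.add(request);
            return true;
        }
    }

    // takes the same lock, so no request can slip in after close() returns
    void close() {
        synchronized (lock) {
            closed = true;
        }
    }

    int pending() {
        synchronized (lock) {
            return outgoing.size();
        }
    }
}
```

making the flag volatile on its own only fixes visibility. it does not close the window between the check and the enqueue.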

> Callbacks are not invoked when the client is closed
> ---
>
> Key: ZOOKEEPER-794
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-794
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.3.1
>Reporter: Alexis Midon
>Assignee: Alexis Midon
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-794.patch.txt, ZOOKEEPER-794.txt
>
>
> I noticed that ZooKeeper has different behaviors when calling synchronous or 
> asynchronous actions on a closed ZooKeeper client.
> Actually a synchronous call will throw a "session expired" exception while an 
> asynchronous call will do nothing. No exception, no callback invocation.
> Actually, even if the EventThread receives the Packet with the session 
> expired err code, the packet is never processed since the thread has been 
> killed by the eventOfDeath. So the callback is not invoked.




[jira] Updated: (ZOOKEEPER-712) Bookie recovery

2010-07-09 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-712:


Status: Resolved  (was: Patch Available)
Resolution: Fixed

Committed revision 962697.


> Bookie recovery
> ---
>
> Key: ZOOKEEPER-712
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-712
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: contrib-bookkeeper
>Reporter: Flavio Paiva Junqueira
>Assignee: Erwin Tam
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-712.patch
>
>
> Recover the ledger fragments of a bookie once it crashes.




[jira] Updated: (ZOOKEEPER-719) Add throttling to BookKeeper client

2010-07-09 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-719:


Status: Resolved  (was: Patch Available)
Resolution: Fixed

Committed revision 962693.

> Add throttling to BookKeeper client
> ---
>
> Key: ZOOKEEPER-719
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-719
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-bookkeeper
>Affects Versions: 3.3.0
>Reporter: Flavio Paiva Junqueira
>Assignee: Flavio Paiva Junqueira
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-719.patch, ZOOKEEPER-719.patch, 
> ZOOKEEPER-719.patch, ZOOKEEPER-719.patch
>
>
> Add throttling to client to control the rate of operations to bookies. 




[jira] Updated: (ZOOKEEPER-712) Bookie recovery

2010-07-09 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-712:


Hadoop Flags: [Reviewed]

+1 looks good. thanx erwin!

> Bookie recovery
> ---
>
> Key: ZOOKEEPER-712
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-712
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: contrib-bookkeeper
>Reporter: Flavio Paiva Junqueira
>Assignee: Erwin Tam
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-712.patch
>
>
> Recover the ledger fragments of a bookie once it crashes.




[jira] Created: (ZOOKEEPER-807) bookkeeper does not put enough meta-data in to do recovery properly

2010-07-09 Thread Benjamin Reed (JIRA)
bookkeeper does not put enough meta-data in to do recovery properly
---

 Key: ZOOKEEPER-807
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-807
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bookkeeper
Reporter: Benjamin Reed


somewhere, probably zookeeper, we need to keep track of the information 
about keys used for access and for mac validation as well as the digest type 
for entries. we can't write a general recovery tool without it.




[jira] Commented: (ZOOKEEPER-806) Cluster management with Zookeeper - Norbert

2010-07-09 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886864#action_12886864
 ] 

Benjamin Reed commented on ZOOKEEPER-806:
-

this looks really cool. is there a collaboration model you were thinking of? 
(btw, have you guys thought of presenting this at the hadoop summit or similar 
venue?)

> Cluster management with Zookeeper - Norbert
> ---
>
> Key: ZOOKEEPER-806
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-806
> Project: Zookeeper
>  Issue Type: New Feature
>Reporter: John Wang
>
> Hello, we have built a cluster management layer on top of Zookeeper here at 
> the SNA team at LinkedIn: 
> http://sna-projects.com/norbert/
> We were wondering ways for collaboration as this is a very useful application 
> of zookeeper.




[jira] Commented: (ZOOKEEPER-719) Add throttling to BookKeeper client

2010-07-07 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886019#action_12886019
 ] 

Benjamin Reed commented on ZOOKEEPER-719:
-

+1 looks good

> Add throttling to BookKeeper client
> ---
>
> Key: ZOOKEEPER-719
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-719
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-bookkeeper
>Affects Versions: 3.3.0
>Reporter: Flavio Paiva Junqueira
>Assignee: Flavio Paiva Junqueira
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-719.patch, ZOOKEEPER-719.patch, 
> ZOOKEEPER-719.patch, ZOOKEEPER-719.patch
>
>
> Add throttling to client to control the rate of operations to bookies. 




[jira] Commented: (ZOOKEEPER-719) Add throttling to BookKeeper client

2010-06-14 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878850#action_12878850
 ] 

Benjamin Reed commented on ZOOKEEPER-719:
-

i think using a system property is still the easiest, but i'm fine with the 
set/get if you want to do it. you just need to make it thread safe.
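
a rough sketch of the two options side by side, under stated assumptions: the property name, the default of 5000 and all class names are invented for illustration and are not what the patch uses. the limit is seeded from a system property and exposed through a thread-safe get/set, and a semaphore does the actual throttling.

```java
import java.util.concurrent.Semaphore;

// hypothetical configuration holder: property name and default are assumptions.
final class ThrottleConfigSketch {
    static final String PROPERTY = "sketch.throttle.maxOutstanding";

    // volatile so a value written by one thread is visible to all others
    private static volatile int maxOutstanding = Integer.getInteger(PROPERTY, 5000);

    private ThrottleConfigSketch() {}

    static int getMaxOutstanding() {
        return maxOutstanding;
    }

    static void setMaxOutstanding(int value) {
        if (value <= 0) {
            throw new IllegalArgumentException("limit must be positive: " + value);
        }
        maxOutstanding = value;
    }
}

// hypothetical throttle: at most 'limit' operations in flight at once.
final class ThrottleSketch {
    private final Semaphore permits;

    ThrottleSketch(int limit) {
        this.permits = new Semaphore(limit);
    }

    // returns false instead of blocking when the limit is reached
    boolean tryStart() {
        return permits.tryAcquire();
    }

    // call once per completed operation that was started successfully
    void finished() {
        permits.release();
    }
}
```

a single volatile int is enough here because readers and writers never need a compound read-modify-write. anything more elaborate would need a lock or an atomic.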

> Add throttling to BookKeeper client
> ---
>
> Key: ZOOKEEPER-719
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-719
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-bookkeeper
>Affects Versions: 3.3.0
>Reporter: Flavio Paiva Junqueira
>Assignee: Flavio Paiva Junqueira
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-719.patch, ZOOKEEPER-719.patch
>
>
> Add throttling to client to control the rate of operations to bookies. 




[jira] Commented: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code

2010-06-09 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877131#action_12877131
 ] 

Benjamin Reed commented on ZOOKEEPER-767:
-

1) just to make sure we are talking about the same thing. this is the code i'm 
referring to:

{noformat}
// Check that we don't already have a lock...
if (currentExclusiveLock != null && !isExpired(currentExclusiveLock)) {
   // We have the exclusive lock! Remove newly made lock file and just
   // return.
   zooKeeper.delete(writeLock, -1);
   return currentExclusiveLock;
}
{noformat}

2) no, i'm talking about when you go to get the shared lock, you first check to 
see if you have a shared lock. shouldn't you check for both shared and 
exclusive?

3) the problem is that connection loss and session expiration are different. 
with connection loss you will get an exception, but your session can recover 
and you can keep using it. for session expired you are right the EPHEMERAL will 
go away. in the connection loss scenario you have a situation where you may 
acquire a lock but not know it.

with regard to the question about the current lock implementation in the 
repository: i'm trying to understand the differences between that 
implementation and yours. both follow the same recipe, right? if the current 
lock implementation implemented shared locks, would you have used that one? 
or is there something more fundamental?
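
one common way to handle the connection loss case in 3), sketched here without any zookeeper dependency: put a prefix that is unique to the client (for example one derived from the session id) into the lock node name, and after reconnecting list the children and look for that prefix before creating again. the naming scheme and the names below are assumptions for illustration, not your patch's format.

```java
import java.util.List;
import java.util.Optional;

// hypothetical helper; it works on the list of child names that a
// getChildren call on the lock directory would return.
final class LockRecoverySketch {
    private LockRecoverySketch() {}

    // returns the lock node this client created earlier, if the create
    // actually succeeded on the server before the connection was lost.
    static Optional<String> findOwnNode(List<String> children, String ownPrefix) {
        for (String child : children) {
            if (child.startsWith(ownPrefix)) {
                return Optional.of(child);
            }
        }
        return Optional.empty();
    }
}
```

if findOwnNode returns a name, the earlier create went through and the client should use that node instead of creating a second one.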

> Submitting Demo/Recipe Shared / Exclusive Lock Code
> ---
>
> Key: ZOOKEEPER-767
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-767
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: recipes
>Affects Versions: 3.3.0
>Reporter: Sam Baskinger
>Assignee: Sam Baskinger
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-767.patch, ZOOKEEPER-767.patch
>
>
> Networked Insights would like to share-back some code for shared/exclusive 
> locking that we are using in our labs.




[jira] Commented: (ZOOKEEPER-785) Zookeeper 3.3.1 shouldn't infinite loop if someone creates a server.0 line

2010-06-03 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875227#action_12875227
 ] 

Benjamin Reed commented on ZOOKEEPER-785:
-

+1 i think we should log the message as a warning rather than error since we 
completely recover from the situation. we may also want to log a warning for 2 
servers to indicate that failures will not be tolerated. (feel free to ignore 
both comments and commit the patch :)
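
the arithmetic behind the 2 server warning, as a small sketch. it assumes plain majority quorums (no weights or hierarchical groups), and the class name is made up.

```java
// majority-quorum arithmetic; assumes every server has one equal vote.
final class QuorumMathSketch {
    private QuorumMathSketch() {}

    // smallest number of servers that forms a strict majority of n
    static int quorumSize(int n) {
        return n / 2 + 1;
    }

    // how many servers can fail while a majority is still reachable
    static int toleratedFailures(int n) {
        return n - quorumSize(n);
    }
}
```

with 2 servers the quorum is 2, so the ensemble tolerates no failures, the same as with a single server. 3 servers tolerate one failure.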

>  Zookeeper 3.3.1 shouldn't infinite loop if someone creates a server.0 line
> ---
>
> Key: ZOOKEEPER-785
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-785
> Project: Zookeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.1
> Environment: Tested in linux with a new jvm
>Reporter: Alex Newman
>Assignee: Patrick Hunt
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-785.patch
>
>
> The following config causes an infinite loop
> [zoo.cfg]
> tickTime=2000
> dataDir=/var/zookeeper/
> clientPort=2181
> initLimit=10
> syncLimit=5
> server.0=localhost:2888:3888
> Output:
> 2010-06-01 16:20:32,471 - INFO [main:quorumpeerm...@119] - Starting quorum 
> peer
> 2010-06-01 16:20:32,489 - INFO [main:nioservercnxn$fact...@143] - binding to 
> port 0.0.0.0/0.0.0.0:2181
> 2010-06-01 16:20:32,504 - INFO [main:quorump...@818] - tickTime set to 2000
> 2010-06-01 16:20:32,504 - INFO [main:quorump...@829] - minSessionTimeout set 
> to -1
> 2010-06-01 16:20:32,505 - INFO [main:quorump...@840] - maxSessionTimeout set 
> to -1
> 2010-06-01 16:20:32,505 - INFO [main:quorump...@855] - initLimit set to 10
> 2010-06-01 16:20:32,526 - INFO [main:files...@82] - Reading snapshot 
> /var/zookeeper/version-2/snapshot.c
> 2010-06-01 16:20:32,547 - INFO [Thread-1:quorumcnxmanager$liste...@436] - My 
> election bind port: 3888
> 2010-06-01 16:20:32,554 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@620] - LOOKING
> 2010-06-01 16:20:32,556 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@649] - New election. My 
> id = 0, Proposed zxid = 12
> 2010-06-01 16:20:32,558 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@689] - Notification: 0, 
> 12, 1, 0, LOOKING, LOOKING, 0
> 2010-06-01 16:20:32,560 - WARN 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@623] - Unexpected exception
> java.lang.NullPointerException
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.totalOrderPredicate(FastLeaderElection.java:496)
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:709)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:621)
> 2010-06-01 16:20:32,560 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@620] - LOOKING
> 2010-06-01 16:20:32,560 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@649] - New election. My 
> id = 0, Proposed zxid = 12
> 2010-06-01 16:20:32,561 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@689] - Notification: 0, 
> 12, 2, 0, LOOKING, LOOKING, 0
> 2010-06-01 16:20:32,561 - WARN 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@623] - Unexpected exception
> java.lang.NullPointerException
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.totalOrderPredicate(FastLeaderElection.java:496)
> at 
> org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:709)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:621)
> 2010-06-01 16:20:32,561 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@620] - LOOKING
> 2010-06-01 16:20:32,562 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@649] - New election. My 
> id = 0, Proposed zxid = 12
> 2010-06-01 16:20:32,562 - INFO 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@689] - Notification: 0, 
> 12, 3, 0, LOOKING, LOOKING, 0
> 2010-06-01 16:20:32,562 - WARN 
> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@623] - Unexpected exception
> java.lang.NullPointerException
> Things like HBase require that the zookeeper servers be listed in the 
> zoo.cfg. This is a bug on their part, but zookeeper shouldn't null pointer in 
> a loop though.




[jira] Updated: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code

2010-06-03 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-767:


Status: Open  (was: Patch Available)

> Submitting Demo/Recipe Shared / Exclusive Lock Code
> ---
>
> Key: ZOOKEEPER-767
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-767
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: recipes
>Affects Versions: 3.3.0
>Reporter: Sam Baskinger
>Assignee: Sam Baskinger
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-767.patch, ZOOKEEPER-767.patch
>
>
> Networked Insights would like to share-back some code for shared/exclusive 
> locking that we are using in our labs.




[jira] Commented: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code

2010-06-03 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875221#action_12875221
 ] 

Benjamin Reed commented on ZOOKEEPER-767:
-

-1 there are a couple of problems with the implementation:

1) shouldn't you check to see if you already have a lock before you do the 
create? that will remove the code right after the create in the getLock() 
methods.

2) if you already have an exclusive lock, shouldn't that also count as a shared 
lock?

3) the error handling is a bit problematic. a connection loss exception or an 
interrupt can leave a process holding a lock without knowing it.

4) when you go through the children, you may end up checking for the existence 
of every znode before you, which could be wasteful.

i think it may be better to expand the current locking code to handle shared 
lock rather than add a new lock implementation. the current lock recipe 
implementation only does exclusive locks, but it is implemented in a way that 
makes it easy to support shared locks as well and it takes care of the above 
problems.
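
to show what i mean in 4), here is a minimal sketch of picking the single znode to watch instead of checking every znode ahead of you. it assumes children are named "read-<sequence>" or "write-<sequence>" as in the shared lock recipe. the class and method names are hypothetical and this is not the existing recipe code.

```java
import java.util.List;
import java.util.Optional;

// hypothetical helper working on child names of the lock directory.
final class LockOrderSketch {
    private LockOrderSketch() {}

    // sequence number = digits after the last '-'
    static long seq(String name) {
        return Long.parseLong(name.substring(name.lastIndexOf('-') + 1));
    }

    static boolean isWrite(String name) {
        return name.startsWith("write-");
    }

    // returns the one node to set an exists watch on, or empty if 'mine'
    // already holds the lock.
    static Optional<String> nodeToWatch(List<String> children, String mine) {
        long mySeq = seq(mine);
        boolean wantWrite = isWrite(mine);
        String best = null;
        for (String child : children) {
            long s = seq(child);
            if (s >= mySeq) {
                continue; // only nodes ahead of us matter
            }
            if (!wantWrite && !isWrite(child)) {
                continue; // readers are not blocked by other readers
            }
            if (best == null || s > seq(best)) {
                best = child; // keep the closest predecessor
            }
        }
        return Optional.ofNullable(best);
    }
}
```

a writer watches its immediate predecessor and a reader watches the closest writer ahead of it, so each client sets one watch per attempt rather than probing every earlier znode.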

> Submitting Demo/Recipe Shared / Exclusive Lock Code
> ---
>
> Key: ZOOKEEPER-767
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-767
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: recipes
>Affects Versions: 3.3.0
>Reporter: Sam Baskinger
>Assignee: Sam Baskinger
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: ZOOKEEPER-767.patch, ZOOKEEPER-767.patch
>
>
> Networked Insights would like to share-back some code for shared/exclusive 
> locking that we are using in our labs.




[jira] Updated: (ZOOKEEPER-775) A large scale pub/sub system

2010-06-03 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-775:


Status: Patch Available  (was: Open)

> A large scale pub/sub system
> 
>
> Key: ZOOKEEPER-775
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-775
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: contrib
>Reporter: Benjamin Reed
>Assignee: Benjamin Reed
> Fix For: 3.4.0
>
> Attachments: libs.zip, libs_2.zip, ZOOKEEPER-775.patch, 
> ZOOKEEPER-775_2.patch, ZOOKEEPER-775_3.patch
>
>
> we have developed a large scale pub/sub system based on ZooKeeper and 
> BookKeeper.




[jira] Updated: (ZOOKEEPER-733) use netty to handle client connections

2010-06-02 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-733:


Attachment: flowctl.zip

here is my cut at flowctl with netty. flow control seems to be happening, but 
it doesn't seem to fix the problem.

> use netty to handle client connections
> --
>
> Key: ZOOKEEPER-733
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-733
> Project: Zookeeper
>  Issue Type: Improvement
>Reporter: Benjamin Reed
> Attachments: accessive.jar, flowctl.zip, moved.zip, 
> QuorumTestFailed_sessionmoved_TRACE_LOG.txt.gz, ZOOKEEPER-733.patch, 
> ZOOKEEPER-733.patch, ZOOKEEPER-733.patch
>
>
> we currently have our own asynchronous NIO socket engine to be able to handle 
> lots of clients with a single thread. over time the engine has become more 
> complicated. we would also like the engine to use multiple threads on 
> machines with lots of cores. plus, we would like to be able to support things 
> like SSL. if we switch to netty, we can simplify our code and get the 
> previously mentioned benefits.




[jira] Updated: (ZOOKEEPER-775) A large scale pub/sub system

2010-06-01 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-775:


Attachment: ZOOKEEPER-775_3.patch
libs_2.zip

updated to address phunt's comments.

> A large scale pub/sub system
> 
>
> Key: ZOOKEEPER-775
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-775
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: contrib
>Reporter: Benjamin Reed
>Assignee: Benjamin Reed
> Fix For: 3.4.0
>
> Attachments: libs.zip, libs_2.zip, ZOOKEEPER-775.patch, 
> ZOOKEEPER-775_2.patch, ZOOKEEPER-775_3.patch
>
>
> we have developed a large scale pub/sub system based on ZooKeeper and 
> BookKeeper.




[jira] Commented: (ZOOKEEPER-775) A large scale pub/sub system

2010-06-01 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874137#action_12874137
 ] 

Benjamin Reed commented on ZOOKEEPER-775:
-

i would like to fix the build once we have it in the subversion repository.

should i just remove the README? i'm not sure it is worth expanding, since it 
would duplicate text in the docs directory.

i'll fix the scripts and the dos2unix issue.

with respect to the headers, i notice that configs, docs, and Makefiles don't 
have the license header in the zk repository, which leaves:

./pom.xml
./client/pom.xml
./protocol/pom.xml
./protocol/src/main/protobuf/PubSubProtocol.proto
./scripts/analyze.py
./scripts/hw.bash
./scripts/quote
./server/pom.xml

is it okay if i just do those?

> A large scale pub/sub system
> 
>
> Key: ZOOKEEPER-775
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-775
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: contrib
>Reporter: Benjamin Reed
>Assignee: Benjamin Reed
> Fix For: 3.4.0
>
> Attachments: libs.zip, ZOOKEEPER-775.patch, ZOOKEEPER-775_2.patch
>
>
> we have developed a large scale pub/sub system based on ZooKeeper and 
> BookKeeper.



