[jira] Updated: (ZOOKEEPER-559) valgrind warnings running zkpython bindings

2009-10-26 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-559:
---

Attachment: valgrind-zk.tar.gz

valgrind output for zkpython bindings running zk-smoketest scripts (smoketest 
and latencytest)


 valgrind warnings running zkpython bindings
 ---

 Key: ZOOKEEPER-559
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-559
 Project: Zookeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.0
Reporter: Patrick Hunt
Assignee: Henry Robinson
 Fix For: 3.3.0

 Attachments: valgrind-zk.tar.gz


 I'm seeing some weird behavior running zk-latencies.py
 (http://github.com/phunt/zk-smoketest). I don't know whether it's related to
 the zkpython bindings themselves, but I ran valgrind to see if it noticed any
 issues; see the attached output.
 As far as I can tell these issues are related to the zkpython binding. I ran
 valgrind against the ZooKeeper C library tests and these issues were not
 highlighted there, so I'm thinking these are zkpython errors, but I'm not
 100% sure.
 Henry, can you take a look?
  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-559) valgrind warnings running zkpython bindings

2009-10-26 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12770154#action_12770154
 ] 

Patrick Hunt commented on ZOOKEEPER-559:


here's something additional from the zkpython unit tests (many of these):

==6532== Conditional jump or move depends on uninitialised value(s)
==6532==    at 0x807D692: PyInt_FromLong (intobject.c:89)
==6532==    by 0x80F8C30: do_mkvalue (modsupport.c:333)
==6532==    by 0x80F8DC1: do_mkvalue (modsupport.c:179)
==6532==    by 0x80F9534: va_build_value (modsupport.c:536)
==6532==    by 0x80F95B2: Py_BuildValue (modsupport.c:484)
==6532==    by 0x40360BE: build_stat (zookeeper.c:209)
==6532==    by 0x403980F: pyzoo_get (zookeeper.c:870)
==6532==    by 0x8061119: PyObject_Call (abstract.c:2492)
==6532==    by 0x80DB1CC: PyEval_EvalFrameEx (ceval.c:4005)
==6532==    by 0x80E00B7: PyEval_EvalCodeEx (ceval.c:2968)
==6532==    by 0x80DE5F7: PyEval_EvalFrameEx (ceval.c:3802)
==6532==    by 0x80E00B7: PyEval_EvalCodeEx (ceval.c:2968)
==6532== 
==6532== Use of uninitialised value of size 4
==6532==    at 0x807D6C8: PyInt_FromLong (intobject.c:91)
==6532==    by 0x80F8C30: do_mkvalue (modsupport.c:333)
==6532==    by 0x80F8DC1: do_mkvalue (modsupport.c:179)
==6532==    by 0x80F9534: va_build_value (modsupport.c:536)
==6532==    by 0x80F95B2: Py_BuildValue (modsupport.c:484)
==6532==    by 0x40360BE: build_stat (zookeeper.c:209)
==6532==    by 0x403980F: pyzoo_get (zookeeper.c:870)
==6532==    by 0x8061119: PyObject_Call (abstract.c:2492)
==6532==    by 0x80DB1CC: PyEval_EvalFrameEx (ceval.c:4005)
==6532==    by 0x80E00B7: PyEval_EvalCodeEx (ceval.c:2968)
==6532==    by 0x80DE5F7: PyEval_EvalFrameEx (ceval.c:3802)
==6532==    by 0x80E00B7: PyEval_EvalCodeEx (ceval.c:2968)





[jira] Created: (ZOOKEEPER-562) c client can flood server with pings if tcp send queue filled

2009-10-26 Thread Patrick Hunt (JIRA)
c client can flood server with pings if tcp send queue filled
-

 Key: ZOOKEEPER-562
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-562
 Project: Zookeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.2.1
Reporter: Patrick Hunt
Assignee: Mahadev konar
Priority: Blocker
 Fix For: 3.2.2, 3.3.0


The C client can flood the server with pings if the TCP send queue is filled.

Say the cluster is overloaded and stops processing received data. A C client
can send a ping, but last_send is only updated when data is successfully
pushed into the socket. If flush_send_queue fails to send any data
(send_buffer returns 0), then last_send is not updated and zookeeper_interest
will send another ping the next time it is woken. That wait could be 0 if
recv_to is close to 0, which could easily happen if the server is not sending
data to the client.





[jira] Commented: (ZOOKEEPER-562) c client can flood server with pings if tcp send queue filled

2009-10-26 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12770291#action_12770291
 ] 

Patrick Hunt commented on ZOOKEEPER-562:


send_ping calls wake_io_thread itself, so this is a particularly bad
situation (it forces a tight loop).

The solution is to treat last_send as last_send_attempt, updating it whenever
a send is attempted, whether successful or not.
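
The difference between the two bookkeeping strategies can be sketched with a
small simulation. This is a toy model in Python, not the C client's code:
PingClient, stamp_on_attempt, wakeup, and count_pings are hypothetical names,
time is a plain integer tick counter, and the socket is assumed never to
accept any data.

```python
class PingClient:
    """Toy model of the ping decision; not the real C client."""

    def __init__(self, ping_interval, stamp_on_attempt):
        self.ping_interval = ping_interval
        # True models stamping every send attempt; False models stamping
        # only successful sends (the behavior described in the issue).
        self.stamp_on_attempt = stamp_on_attempt
        self.last_send = 0
        self.queue = []
        self.pings_queued = 0

    def wakeup(self, now, send_buffer_ok):
        """One pass of the IO loop: maybe queue a ping, then try to flush."""
        if now - self.last_send >= self.ping_interval:
            self.queue.append("ping")
            self.pings_queued += 1
        if self.queue:
            if send_buffer_ok:
                # Data went out: clear the queue and record the send time.
                self.queue.clear()
                self.last_send = now
            elif self.stamp_on_attempt:
                # Nothing went out, but the attempt still counts.
                self.last_send = now


def count_pings(stamp_on_attempt, wakeups=100, ping_interval=10):
    """Wake the client on every tick while the socket accepts nothing."""
    client = PingClient(ping_interval, stamp_on_attempt)
    for now in range(ping_interval, ping_interval + wakeups):
        client.wakeup(now, send_buffer_ok=False)
    return client.pings_queued
```

In this model the client that stamps only successful sends queues a ping on
every wakeup, while the client that stamps every attempt queues a single ping
for the whole stalled period.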





[jira] Assigned: (ZOOKEEPER-562) c client can flood server with pings if tcp send queue filled

2009-10-26 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt reassigned ZOOKEEPER-562:
--

Assignee: Benjamin Reed  (was: Mahadev konar)

Assigning to Ben. 

We should verify that something like this can't happen in the Java client
either. From my reading it seems not, but it would be good to have additional
verification.





[jira] Updated: (ZOOKEEPER-550) Java Queue Recipe

2009-10-26 Thread Steven Cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Cheng updated ZOOKEEPER-550:
---

Status: In Progress  (was: Patch Available)

 Java Queue Recipe
 -

 Key: ZOOKEEPER-550
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-550
 Project: Zookeeper
  Issue Type: New Feature
  Components: java client
Affects Versions: 3.2.1
Reporter: Steven Cheng
Assignee: Steven Cheng
Priority: Minor
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-550.patch, ZOOKEEPER-550.patch, 
 ZOOKEEPER-550.patch, ZOOKEEPER-550.patch, ZOOKEEPER-550.patch


 This patch adds a recipe for creating a distributed queue with ZooKeeper,
 similar to the WriteLock recipe, along with some sequential tests. This early
 attempt follows the Java BlockingQueue interface, though it doesn't implement
 it since I don't think there's a good reason for it to be Iterable.




[jira] Updated: (ZOOKEEPER-550) Java Queue Recipe

2009-10-26 Thread Steven Cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Cheng updated ZOOKEEPER-550:
---

Status: Patch Available  (was: In Progress)




[jira] Commented: (ZOOKEEPER-550) Java Queue Recipe

2009-10-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12770382#action_12770382
 ] 

Hadoop QA commented on ZOOKEEPER-550:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12423133/ZOOKEEPER-550.patch
  against trunk revision 828216.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 60 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 185 release audit warnings 
(more than the trunk's current 179 warnings).

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/39/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/39/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/39/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/39/console

This message is automatically generated.




Possible race in LETest.java

2009-10-26 Thread Henry Robinson
I've been working on adding a TCPResponderThread to the leader election
process so that if a deployment needs to be TCP only, it can be and still
use all election types. Testing this has exposed what might be a race
condition in the leader election code that prevents a leader from being
elected.

Here's the behaviour I see in LETest occasionally. With three nodes (reduced
from 30 for ease of debugging), node 3 gets elected before either node 1 or
node 2 finishes its election. There is one round where each node learns that
3 has the highest id, and then 3 completes its second round by receiving votes
for itself from 1 and 2, but 1 and 2 do not receive votes from 3.

Now 3 is killed by the test harness. 1 and 2 are still voting for it, but
every time they try, the vote tally excludes 3 since it hasn't been heard
from. They then spin round the voting process, unable to reset their vote. I
expect that the heartbeat mechanism in a running QuorumPeer takes care of
this when the leader is lost, but the associated QuorumPeers aren't running.

If this is the case, then there is a simple fix: reset a node's vote to
itself if it is voting for a node that hasn't been heard from. I don't know
why using TCP instead of UDP for the responder thread is exacerbating this
(and we can't rule out my introducing a bug :)), but as it's a race condition,
the different timings associated with waiting on a TCP socket might just be
enough to expose the issue.
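
The proposed reset can be sketched as follows. This is a minimal illustration
in Python rather than the Java election code; next_vote and heard_from are
hypothetical names, and heard_from is assumed to be the set of node ids that
responded during the current round.

```python
def next_vote(my_id, current_vote, heard_from):
    """Return the vote a node should carry into the next round.

    heard_from is the set of node ids that responded in the current round.
    """
    if current_vote != my_id and current_vote not in heard_from:
        # The node we were backing has gone silent: vote for ourselves again.
        return my_id
    return current_vote
```

With node 3 killed, nodes 1 and 2 would each see that 3 is missing from
heard_from and go back to voting for themselves, instead of tallying votes
for a node that is never counted.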

Can someone verify this might be possible / figure out what I missed?

cheers,
Henry


[jira] Updated: (ZOOKEEPER-562) c client can flood server with pings if tcp send queue filled

2009-10-26 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-562:


Status: Patch Available  (was: Open)




[jira] Updated: (ZOOKEEPER-562) c client can flood server with pings if tcp send queue filled

2009-10-26 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-562:


Attachment: ZOOKEEPER-562.patch

This patch fixes the problem by only sending a ping if there isn't something
already queued. The test checks for clients sending gratuitous pings.
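
The guard described above can be sketched like this. It is an illustration in
Python under assumed names, not the patch itself: maybe_queue_ping,
send_queue, idle_time, and ping_interval are all hypothetical.

```python
from collections import deque


def maybe_queue_ping(send_queue, idle_time, ping_interval):
    """Queue a ping only if idle long enough and nothing is waiting to be sent."""
    if idle_time >= ping_interval and not send_queue:
        send_queue.append("ping")
        return True
    return False


# The socket is assumed to be stalled, so nothing ever leaves the queue.
queue = deque()
results = [maybe_queue_ping(queue, idle_time=t, ping_interval=10)
           for t in (5, 10, 11, 12)]
```

Once one ping is sitting in the queue, later wakeups add nothing, so a stalled
connection holds at most one outstanding ping.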




[jira] Commented: (ZOOKEEPER-562) c client can flood server with pings if tcp send queue filled

2009-10-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12770398#action_12770398
 ] 

Hadoop QA commented on ZOOKEEPER-562:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12423282/ZOOKEEPER-562.patch
  against trunk revision 828216.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/40/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/40/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/40/console

This message is automatically generated.
