[ https://issues.apache.org/jira/browse/ZOOKEEPER-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15276754#comment-15276754 ]
Patrick Hunt commented on ZOOKEEPER-2425: ----------------------------------------- Can you confirm that Nagle is indeed configured correctly (off) on that platform? e.g.: http://zookeeper-user.578899.n2.nabble.com/Zookeeper-don-t-use-Nagle-algorithm-gt-td7580682.html > Slow ZooKeeper operation on Linux PowerPC > ----------------------------------------- > > Key: ZOOKEEPER-2425 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2425 > Project: ZooKeeper > Issue Type: Bug > Components: c client > Affects Versions: 3.4.6, 3.4.8 > Environment: Linux 2.6.32-431.el6.ppc64 > Reporter: Yip Ng > Attachments: ZOOKEEPER-2425-v3.4.patch > > > We ran into this performance issue in a production environment where > the ZooKeeper operations(create, set, etc.) was very slow. > After some analysis we discover that there appears to be a TCP ACK delay of > ~40+ ms when sending a very small size message(length field) followed by its > payload in ZooKeeper C client(v3.4.6) on a Linux PowerPC machine. The Linux > version is 2.6.32-431.el6.ppc64. > Here is a sample of the strace that illustrates the delays: > 39314 1459790083.730221 send(13, "\0\0\0e", 4, MSG_NOSIGNAL) = 4 <0.000010> > 39314 1459790083.730259 send(13, > "W\2\234\215\0\0\0\6\0\0\0Y/nnn.nnn.nn.nnnnnnn/"..., 101, MSG_NOSIGNAL) = 101 > <0.000009> > 39314 1459790083.730297 poll([{fd=11, events=POLLIN}, {fd=13, > events=POLLIN}], 2, 13333) = 1 ([{fd=13, revents=POLLIN}]) <0.000204> > 39314 1459790083.730535 recv(13, "\0\0\0\216", 4, 0) = 4 <0.000008> > 39314 1459790083.730571 recv(13, > "W\2\234\215\0\0\0\1\0\0\n7\0\0\0\0\0\0\0\1\0\0\0\37\0\0\0\6dige"..., 142, 0) > = 142 <0.000008> > 39314 1459790083.730615 futex(0x1003bd90a2c, FUTEX_CMP_REQUEUE_PRIVATE, 1, > 2147483647, 0x1003bd90a58, 2) = 1 <0.000010> > 39279 1459790083.730647 <... futex resumed> ) = 0 <0.000488> > 39314 1459790083.730659 poll([{fd=11, events=POLLIN}, {fd=13, > events=POLLIN}], 2, 13333 <unfinished ...> > 39279 1459790083.730681 futex(0x1003bd90a58, FUTEX_WAKE_PRIVATE, 1) = 0 > <0.000012> > 39279 1459790083.730737 write(12, "\0", 1 <unfinished ...> > 39314 1459790083.730762 <... poll resumed> ) = 1 ([{fd=11, revents=POLLIN}]) > <0.000091> > 39279 1459790083.730776 <... write resumed> ) = 1 <0.000030> > 39314 1459790083.730791 read(11, <unfinished ...> > 39279 1459790083.730809 futex(0x1003bd90a2c, FUTEX_WAIT_PRIVATE, 1, NULL > <unfinished ...> > 39314 1459790083.730819 <... read resumed> "\0", 128) = 1 <0.000021> > 39314 1459790083.730838 poll([{fd=11, events=POLLIN}, {fd=13, > events=POLLIN|POLLOUT}], 2, 13333) = 1 ([{fd=13, revents=POLLOUT}]) <0.000008> > 39314 1459790083.730879 send(13, "\0\0\0f", 4, MSG_NOSIGNAL) = 4 <0.000010> > 39314 1459790083.730917 send(13, > "W\2\234\216\0\0\0\f\0\0\0Y/nnn.nnn.nn.nnnnnnn/"..., 102, MSG_NOSIGNAL) = 102 > <0.000010> > 39314 1459790083.730961 poll([{fd=11, events=POLLIN}, {fd=13, > events=POLLIN}], 2, 13333) = 1 ([{fd=13, revents=POLLIN}]) <0.000189> > 39314 1459790083.731184 recv(13, "\0\0\0k", 4, 0) = 4 <0.000007> > 39314 1459790083.731219 recv(13, > "W\2\234\216\0\0\0\1\0\0\n7\0\0\0\0\0\0\0\2\0\0\0\4stop\0\0\0\7"..., 107, 0) > = 107 <0.000008> > 39314 1459790083.731264 futex(0x1003bd90a2c, FUTEX_CMP_REQUEUE_PRIVATE, 1, > 2147483647, 0x1003bd90a58, 2) = 1 <0.000010> > 39279 1459790083.731296 <... futex resumed> ) = 0 <0.000480> > 39314 1459790083.731307 poll([{fd=11, events=POLLIN}, {fd=13, > events=POLLIN}], 2, 13333 <unfinished ...> > 39279 1459790083.731329 futex(0x1003bd90a58, FUTEX_WAKE_PRIVATE, 1) = 0 > <0.000013> > 39279 1459790083.731443 write(12, "\0", 1) = 1 <0.000019> > 39314 1459790083.731485 <... poll resumed> ) = 1 ([{fd=11, revents=POLLIN}]) > <0.000165> > 39279 1459790083.731508 futex(0x1003bd90a2c, FUTEX_WAIT_PRIVATE, 1, NULL > <unfinished ...> > 39314 1459790083.731524 read(11, "\0", 128) = 1 <0.000009> > 39314 1459790083.731557 poll([{fd=11, events=POLLIN}, {fd=13, > events=POLLIN|POLLOUT}], 2, 13333) = 1 ([{fd=13, revents=POLLOUT}]) <0.000008> > 39314 1459790083.731598 send(13, "\0\0\0k", 4, MSG_NOSIGNAL) = 4 <0.000010> > 39314 1459790083.731636 send(13, > "W\2\234\217\0\0\0\f\0\0\0^/nnn.nnn.nn.nnnnnnn/"..., 107, MSG_NOSIGNAL) = 107 > <0.000010> > 39314 1459790083.731677 poll([{fd=11, events=POLLIN}, {fd=13, > events=POLLIN}], 2, 13333) = 1 ([{fd=13, revents=POLLIN}]) <0.000188> > 39314 1459790083.731900 recv(13, "\0\0\0X", 4, 0) = 4 <0.000008> > 39314 1459790083.731936 recv(13, > "W\2\234\217\0\0\0\1\0\0\n7\0\0\0\0\0\0\0\0\0\0\0\1\0\0\n&\0\0\0\1"..., 88, > 0) = 88 <0.000007> > The fix is to collapse the 2 sends (length + payload) into a single send. -- This message was sent by Atlassian JIRA (v6.3.4#6332)