Yip Ng created ZOOKEEPER-2425:
---------------------------------

             Summary: Slow ZooKeeper operation on Linux PowerPC
                 Key: ZOOKEEPER-2425
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2425
             Project: ZooKeeper
          Issue Type: Bug
          Components: c client
    Affects Versions: 3.4.8, 3.4.6
         Environment: Linux 2.6.32-431.el6.ppc64
            Reporter: Yip Ng


We ran into this performance issue in a production environment where 
the ZooKeeper operations(create, set, etc.) was very slow.
After some analysis we discover that there appears to be a TCP ACK delay of 
~40+ ms when sending a very small size message(length field) followed by its 
payload in ZooKeeper C client(v3.4.6) on a Linux PowerPC machine.  The Linux 
version is 2.6.32-431.el6.ppc64.   

Here is a sample of the strace that illustrates the delays:

39314 1459790083.730221 send(13, "\0\0\0e", 4, MSG_NOSIGNAL) = 4 <0.000010>
39314 1459790083.730259 send(13, 
"W\2\234\215\0\0\0\6\0\0\0Y/nnn.nnn.nn.nnnnnnn/"..., 101, MSG_NOSIGNAL) = 101 
<0.000009>
39314 1459790083.730297 poll([{fd=11, events=POLLIN}, {fd=13, events=POLLIN}], 
2, 13333) = 1 ([{fd=13, revents=POLLIN}]) <0.000204>
39314 1459790083.730535 recv(13, "\0\0\0\216", 4, 0) = 4 <0.000008>
39314 1459790083.730571 recv(13, 
"W\2\234\215\0\0\0\1\0\0\n7\0\0\0\0\0\0\0\1\0\0\0\37\0\0\0\6dige"..., 142, 0) = 
142 <0.000008>
39314 1459790083.730615 futex(0x1003bd90a2c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 
2147483647, 0x1003bd90a58, 2) = 1 <0.000010>
39279 1459790083.730647 <... futex resumed> ) = 0 <0.000488>
39314 1459790083.730659 poll([{fd=11, events=POLLIN}, {fd=13, events=POLLIN}], 
2, 13333 <unfinished ...>
39279 1459790083.730681 futex(0x1003bd90a58, FUTEX_WAKE_PRIVATE, 1) = 0 
<0.000012>
39279 1459790083.730737 write(12, "\0", 1 <unfinished ...>
39314 1459790083.730762 <... poll resumed> ) = 1 ([{fd=11, revents=POLLIN}]) 
<0.000091>
39279 1459790083.730776 <... write resumed> ) = 1 <0.000030>
39314 1459790083.730791 read(11,  <unfinished ...>
39279 1459790083.730809 futex(0x1003bd90a2c, FUTEX_WAIT_PRIVATE, 1, NULL 
<unfinished ...>
39314 1459790083.730819 <... read resumed> "\0", 128) = 1 <0.000021>
39314 1459790083.730838 poll([{fd=11, events=POLLIN}, {fd=13, 
events=POLLIN|POLLOUT}], 2, 13333) = 1 ([{fd=13, revents=POLLOUT}]) <0.000008>
39314 1459790083.730879 send(13, "\0\0\0f", 4, MSG_NOSIGNAL) = 4 <0.000010>
39314 1459790083.730917 send(13, 
"W\2\234\216\0\0\0\f\0\0\0Y/nnn.nnn.nn.nnnnnnn/"..., 102, MSG_NOSIGNAL) = 102 
<0.000010>
39314 1459790083.730961 poll([{fd=11, events=POLLIN}, {fd=13, events=POLLIN}], 
2, 13333) = 1 ([{fd=13, revents=POLLIN}]) <0.000189>
39314 1459790083.731184 recv(13, "\0\0\0k", 4, 0) = 4 <0.000007>
39314 1459790083.731219 recv(13, 
"W\2\234\216\0\0\0\1\0\0\n7\0\0\0\0\0\0\0\2\0\0\0\4stop\0\0\0\7"..., 107, 0) = 
107 <0.000008>
39314 1459790083.731264 futex(0x1003bd90a2c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 
2147483647, 0x1003bd90a58, 2) = 1 <0.000010>
39279 1459790083.731296 <... futex resumed> ) = 0 <0.000480>
39314 1459790083.731307 poll([{fd=11, events=POLLIN}, {fd=13, events=POLLIN}], 
2, 13333 <unfinished ...>
39279 1459790083.731329 futex(0x1003bd90a58, FUTEX_WAKE_PRIVATE, 1) = 0 
<0.000013>
39279 1459790083.731443 write(12, "\0", 1) = 1 <0.000019>
39314 1459790083.731485 <... poll resumed> ) = 1 ([{fd=11, revents=POLLIN}]) 
<0.000165>
39279 1459790083.731508 futex(0x1003bd90a2c, FUTEX_WAIT_PRIVATE, 1, NULL 
<unfinished ...>
39314 1459790083.731524 read(11, "\0", 128) = 1 <0.000009>
39314 1459790083.731557 poll([{fd=11, events=POLLIN}, {fd=13, 
events=POLLIN|POLLOUT}], 2, 13333) = 1 ([{fd=13, revents=POLLOUT}]) <0.000008>
39314 1459790083.731598 send(13, "\0\0\0k", 4, MSG_NOSIGNAL) = 4 <0.000010>
39314 1459790083.731636 send(13, 
"W\2\234\217\0\0\0\f\0\0\0^/nnn.nnn.nn.nnnnnnn/"..., 107, MSG_NOSIGNAL) = 107 
<0.000010>
39314 1459790083.731677 poll([{fd=11, events=POLLIN}, {fd=13, events=POLLIN}], 
2, 13333) = 1 ([{fd=13, revents=POLLIN}]) <0.000188>
39314 1459790083.731900 recv(13, "\0\0\0X", 4, 0) = 4 <0.000008>
39314 1459790083.731936 recv(13, 
"W\2\234\217\0\0\0\1\0\0\n7\0\0\0\0\0\0\0\0\0\0\0\1\0\0\n&\0\0\0\1"..., 88, 0) 
= 88 <0.000007> 

The fix is to collapse the 2 sends (length + payload) into a single send. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to