Hey Dormando,

This looks great!  Unfortunately, I'm out of town all of this week, but 
if John or Darryl doesn't jump all over this by the time I get back, 
I'll do my best to carve out some time.

This approach makes a lot of sense - thanks again.

Don


dormando wrote:
> Hey John!
> 
> (Don/Darryl, read this too, please?)
> 
> Thanks for the detailed report. I've been busy with work all week and 
> wanted to give this a non-hand-wavy response.
> 
> As far as connection issues go, and that specific explanation of the 
> SYN/ACK actually happening but really late, I believe this is the only 
> possible fix:
> 
> http://consoleninja.net/gitweb/gitweb.cgi?p=memcached.git;a=commitdiff;h=99bbd00c3f948b795ef83ef76fcbb00d92151e60
> 
> Been kicking around stable tree tonight:
> 
> http://consoleninja.net/gitweb/gitweb.cgi?p=memcached.git;a=shortlog;h=stable
> 
> ... I'll hopefully send a followup soon about the facebook patches. Bear 
> with us for now :)
> 
> You didn't see any TCP retries during that window, etc? My gut tells me 
> this fix will repair some timeouts, but the odds of accept not kicking in 
> for a full second are too low.
> 
> If you're running single-threaded, you'll need to switch to multi-threaded 
> to get this benefit. A dedicated accept thread is a common design pattern 
> that memcached simply hadn't adopted until now.
> 
> Any chance you/Don/etc could run the stable tree on a box or two and see 
> if it removes *or* reduces the connection timeout?
> 
> -Dormando
> 
> On Tue, 19 Aug 2008, John Allspaw wrote:
> 
>> Hello.
>>
>> We've seen connection issues with memcached for a while now, and the cause
>> is elusive. I'd love for it to be a fault in the network, and have been
>> biased in looking for that to be the cause, but I can't find anything up in
>> tcp/ip land to be the culprit.
>>
>> The client logs an error like this:
>> www100.flickr [19/Aug/2008:13:47:41 +0000] [error] [client x.x.x.x]
>> [app_warn] [php] WARNING: connect() [<a
>> href='function.connect'>function.connect</a>]: Can't connect to woe1:11211,
>> Connection failed (0) in <php script> line 287
>>
>> A tcpdump shows it in the wild: client sends a SYN, memcached server takes
>> more than 5 seconds to return a SYN/ACK, at which point the client gives
>> memcached the finger via an RST packet:
>>
>> No.  Time      Source           Destination      Protocol  src port  dst port  Info
>> 165  1.262255  209.191.105.168  68.142.214.227   TCP       9048      11211     9048 > 11211 [SYN] Seq=0 Len=0 MSS=1460 WS=8
>> 738  5.093003  68.142.214.227   209.191.105.168  TCP       11211     9048      11211 > 9048 [SYN, ACK] Seq=0 Ack=1 Win=373760 Len=0 MSS=1460 WS=6
>> 739  5.093016  209.191.105.168  68.142.214.227   TCP       9048      11211     9048 > 11211 [RST] Seq=1 Len=0
>>
>> The client has PECL php client memcache-3.0.1, server is memcached-1.2.6.
>>
>> An mtr run for 24 hours shows no packet loss between the two machines, and
>> the issue isn't port/switch/host specific, since we see this same issue from
>> all of our front-end machines, across all of our memcached servers, which
>> span several racks and switches.  The client connects via IP, so no DNS is
>> needed.
>>
>> No firewalls/iptables/connection tracking/etc running on either client or
>> server.
>>
>> Any thoughts? We're handling the connection failures, but it's annoying and
>> I can't help but think there's something stupid going on.
>> More detail on both client and server below.
>>
>> thanks,
>> allspaw
>>
>>
>> server is: RHEL 4U2 2.6.9-22.ELsmp #1 SMP Mon Sep 19 18:32:14 EDT 2005 i686
>> i686 i386 GNU/Linux
>> client is: RHEL 4U4 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686
>> i686 i386 GNU/Linux
>>
>> lsmod for server is:
>> Module                  Size  Used by
>> md5                     8001  1
>> ipv6                  240097  609
>> i2c_dev                14273  0
>> i2c_core               25921  1 i2c_dev
>> nfs                   199205  1
>> lockd                  65257  2 nfs
>> sunrpc                139173  4 nfs,lockd
>> dm_mirror              28449  0
>> dm_mod                 58949  1 dm_mirror
>> uhci_hcd               32729  0
>> ehci_hcd               31813  0
>> e1000                  96429  0
>> floppy                 58065  0
>> aic79xx               187485  0
>> ext3                  118729  1
>> jbd                    59481  1 ext3
>> sata_sil               12869  0
>> ata_piix               13253  0
>> libata                 47901  2 sata_sil,ata_piix
>> megaraid_mbox          37073  0
>> megaraid_mm            17905  1 megaraid_mbox
>> sd_mod                 20545  0
>> scsi_mod              116429  4 aic79xx,libata,megaraid_mbox,sd_mod
>>
>> lsmod for client is:
>> Module                  Size  Used by
>> ylock                  17568  2
>> md5                     8001  1
>> ipv6                  241761  28
>> i2c_dev                14273  0
>> i2c_core               25921  1 i2c_dev
>> button                 10449  0
>> battery                12869  0
>> ac                      8773  0
>> joydev                 14209  0
>> uhci_hcd               32729  0
>> ehci_hcd               32069  0
>> tg3                   100933  0
>> dm_snapshot            21093  0
>> dm_zero                 6337  0
>> dm_mirror              31645  0
>> ext3                  118729  4
>> jbd                    59609  1 ext3
>> dm_mod                 60357  12 dm_snapshot,dm_zero,dm_mirror
>> mptscsih                5569  0
>> mptsas                 13389  3 mptscsih
>> mptspi                 13261  1 mptscsih
>> mptfc                  12617  1 mptscsih
>> mptscsi                44125  3 mptsas,mptspi,mptfc
>> mptbase                61345  4 mptsas,mptspi,mptfc,mptscsi
>> sd_mod                 20545  3
>> scsi_mod              117709  5 mptsas,mptspi,mptfc,mptscsi,sd_mod
>>
>> sysctl -a | grep tcp for both client and server shows:
>> sunrpc.tcp_slot_table_entries = 16
>> net.ipv4.tcp_bic_beta = 819
>> net.ipv4.tcp_tso_win_divisor = 8
>> net.ipv4.tcp_moderate_rcvbuf = 1
>> net.ipv4.tcp_bic_low_window = 14
>> net.ipv4.tcp_bic_fast_convergence = 1
>> net.ipv4.tcp_bic = 1
>> net.ipv4.tcp_vegas_gamma = 2
>> net.ipv4.tcp_vegas_beta = 6
>> net.ipv4.tcp_vegas_alpha = 2
>> net.ipv4.tcp_vegas_cong_avoid = 0
>> net.ipv4.tcp_westwood = 0
>> net.ipv4.tcp_no_metrics_save = 0
>> net.ipv4.tcp_low_latency = 0
>> net.ipv4.tcp_frto = 0
>> net.ipv4.tcp_tw_reuse = 0
>> net.ipv4.tcp_adv_win_scale = 2
>> net.ipv4.tcp_app_win = 31
>> net.ipv4.tcp_rmem = 8192        873800  8738000
>> net.ipv4.tcp_wmem = 4096        655360  6553600
>> net.ipv4.tcp_mem = 786432       1048576 1572864
>> net.ipv4.tcp_dsack = 1
>> net.ipv4.tcp_ecn = 0
>> net.ipv4.tcp_reordering = 3
>> net.ipv4.tcp_fack = 1
>> net.ipv4.tcp_orphan_retries = 0
>> net.ipv4.tcp_max_syn_backlog = 8192
>> net.ipv4.tcp_rfc1337 = 0
>> net.ipv4.tcp_stdurg = 0
>> net.ipv4.tcp_abort_on_overflow = 0
>> net.ipv4.tcp_tw_recycle = 1
>> net.ipv4.tcp_syncookies = 1
>> net.ipv4.tcp_fin_timeout = 10
>> net.ipv4.tcp_retries2 = 15
>> net.ipv4.tcp_retries1 = 3
>> net.ipv4.tcp_keepalive_intvl = 75
>> net.ipv4.tcp_keepalive_probes = 9
>> net.ipv4.tcp_keepalive_time = 7200
>> net.ipv4.tcp_max_tw_buckets = 180000
>> net.ipv4.tcp_max_orphans = 262144
>> net.ipv4.tcp_synack_retries = 5
>> net.ipv4.tcp_syn_retries = 5
>> net.ipv4.tcp_retrans_collapse = 0
>> net.ipv4.tcp_sack = 1
>> net.ipv4.tcp_window_scaling = 1
>> net.ipv4.tcp_timestamps = 0
>> fs.nfs.nlm_tcpport = 0
>>
>> --
>> John Allspaw
>> http://flickr.com/photos/allspaw
>>
> 
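
For readers unfamiliar with the dedicated accept thread pattern dormando mentions, here is a minimal sketch in Python. It is not the memcached patch (memcached is written in C), and every name in it (`worker`, `acceptor`, `start_server`, the echo behaviour, the worker count) is hypothetical and chosen only for illustration. The idea it tries to show: one thread does nothing except call accept() and hand each new connection to a pool of workers, so a busy worker should not delay the handling of new connections.

```python
import queue
import socket
import threading


def worker(conns: queue.Queue) -> None:
    """Serve connections handed over by the acceptor (echo one message each)."""
    while True:
        conn = conns.get()
        if conn is None:  # sentinel value: stop this worker
            break
        with conn:
            data = conn.recv(1024)
            if data:
                conn.sendall(data)


def acceptor(listener: socket.socket, conns: queue.Queue) -> None:
    """Only accept() and enqueue; no request processing happens here."""
    while True:
        try:
            conn, _addr = listener.accept()
        except OSError:  # raised if the listening socket has been closed
            break
        conns.put(conn)


def start_server(num_workers: int = 4):
    """Start one acceptor thread plus num_workers worker threads."""
    # Port 0 asks the OS for any free port; the backlog value is an assumption.
    listener = socket.create_server(("127.0.0.1", 0), backlog=1024)
    conns: queue.Queue = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(conns,), daemon=True)
        for _ in range(num_workers)
    ]
    threads.append(
        threading.Thread(target=acceptor, args=(listener, conns), daemon=True)
    )
    for t in threads:
        t.start()
    return listener, conns
```

A client can then connect to the port reported by `listener.getsockname()[1]`; even if every worker is busy, the acceptor keeps pulling connections off the listen queue and parking them in `conns` until a worker is free.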
> 
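
The handshake delay in the tcpdump excerpt from John's report can be read straight off the Time column. The sketch below assumes that column is seconds relative to the start of the capture; the `TRACE` list and the `handshake_delay` helper are hypothetical names, and the three rows are copied from the excerpt.

```python
from decimal import Decimal

# (packet number, capture time in seconds, TCP flags), copied from the excerpt
TRACE = [
    (165, Decimal("1.262255"), "SYN"),
    (738, Decimal("5.093003"), "SYN, ACK"),
    (739, Decimal("5.093016"), "RST"),
]


def handshake_delay(trace):
    """Return the time between the first SYN and the first SYN/ACK."""
    syn = next(t for _, t, flags in trace if flags == "SYN")
    syn_ack = next(t for _, t, flags in trace if flags == "SYN, ACK")
    return syn_ack - syn


delay = handshake_delay(TRACE)
print(delay)  # 3.830748
```

For these three packets the gap between SYN and SYN/ACK works out to about 3.83 seconds, and the client's RST follows the SYN/ACK 13 microseconds later.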

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"memcached" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/memcached?hl=en
-~----------~----~----~----~------~----~------~--~---
