Re: Issue 106 in memcached: binary protocol parsing can cause memcached server lockup

2011-09-27 Thread memcached

Updates:
Status: Fixed

Comment #12 on issue 106 by dorma...@rydia.net: binary protocol parsing can  
cause memcached server lockup

http://code.google.com/p/memcached/issues/detail?id=106

think this was merged up. closing.



Re: Issue 106 in memcached: binary protocol parsing can cause memcached server lockup

2011-08-08 Thread memcached


Comment #10 on issue 106 by dorma...@rydia.net: binary protocol parsing can  
cause memcached server lockup

http://code.google.com/p/memcached/issues/detail?id=106

The incoming packet is consumed in try_read_udp(), so if you get into a  
conn_close state after reading one packet, calling an empty recvfrom will  
consume the *next*, unseen packet.


Unless there is some case you're talking about which causes the next packet  
on the wire to be an error?


I did push my tree though, so take a look at what I ended up with:
https://github.com/dormando/memcached/commit/954f6dddc41c80d2e625bce63744df5556a425bb

... I can't break it with your tests anymore, and it doesn't hang before  
responding. If you look at how TCP connections are intialized, you'll see  
they're set to conn_new_cmd, but the UDP listeners are initialized to  
conn_read. The latter makes sense in the flow.


Either way, can you try the patch and see if you can still break it? :)



Re: Issue 106 in memcached: binary protocol parsing can cause memcached server lockup

2011-08-08 Thread memcached


Comment #11 on issue 106 by pi3or...@gmail.com: binary protocol parsing can  
cause memcached server lockup

http://code.google.com/p/memcached/issues/detail?id=106

I think you are right.

In my first patch (Comment 1), the recvfrom() call is critical because in  
that situation, if a UDP socket is in conn_closing state, it will never be  
recovered, all incoming packets should be dropped. If we recover its state  
to conn_read, there's no need for the recvfrom() again.





Re: Issue 106 in memcached: binary protocol parsing can cause memcached server lockup

2011-08-07 Thread memcached

Updates:
Status: Started
Owner: trond.no...@gmail.com

Comment #8 on issue 106 by dorma...@rydia.net: binary protocol parsing can  
cause memcached server lockup

http://code.google.com/p/memcached/issues/detail?id=106

I just pulled a patch similar to this into my 1.4.7 tree.

This was confusing me a bit... the extra recvfrom seemed pointless and it'd  
hang while running. The test from #158 would spin out of control without  
the recvfrom.


But the initial state of a UDP connection is actually conn_read, not  
conn_new_cmd. Using the correct state makes it find the next UDP packet and  
throw it away properly.


Still iffy on the rbytes bit, but so long as multiple command parsing works  
(which I don't think it does now anyway), that should be correct.


Punting to trond in case he wants to port this properly to 1.6.0, or take a  
look at least. I'll be pushing my for_147 branch to github at some point.




Re: Issue 106 in memcached: binary protocol parsing can cause memcached server lockup

2011-06-27 Thread memcached


Comment #6 on issue 106 by dorma...@rydia.net: binary protocol parsing can  
cause memcached server lockup

http://code.google.com/p/memcached/issues/detail?id=106

Can you issue a patch against 1.6.0-beta1? Or were you talking about 1.4.5,  
not 1.3.5?




Re: Issue 106 in memcached: binary protocol parsing can cause memcached server lockup

2011-06-27 Thread memcached


Comment #7 on issue 106 by pi3or...@gmail.com: binary protocol parsing can  
cause memcached server lockup

http://code.google.com/p/memcached/issues/detail?id=106

Ok. 1.6.0-beta1 still suffer from this problem, the principle is similar.  
Following patch solve it (I thought).



--- ./daemon/memcached.ori.c2011-06-28 09:43:45.12072 +0800
+++ ./daemon/memcached.c2011-06-28 09:57:18.39071 +0800
@@ -689,13 +689,19 @@
 c-sasl_conn = NULL;
 }

-c-engine_storage = NULL;
-c-tap_iterator = NULL;
-c-thread = NULL;
-assert(c-next == NULL);
-c-ascii_cmd = NULL;
-c-sfd = INVALID_SOCKET;
-c-tap_nack_mode = false;
+if (IS_UDP(c-transport)) {
+recvfrom(c-sfd, NULL, 0, 0, NULL, NULL);
+conn_set_state(c, conn_new_cmd);
+} else {
+c-sfd = INVALID_SOCKET;
+c-engine_storage = NULL;
+c-tap_iterator = NULL;
+c-thread = NULL;
+assert(c-next == NULL);
+c-ascii_cmd = NULL;
+c-tap_nack_mode = false;
+}
+
 }

 void conn_close(conn *c) {
@@ -4826,7 +4832,7 @@
 res -= 8;
 memmove(c-rbuf, c-rbuf + 8, res);

-c-rbytes += res;
+c-rbytes = res;
 c-rcurr = c-rbuf;
 return READ_DATA_RECEIVED;
 }




Re: Issue 106 in memcached: binary protocol parsing can cause memcached server lockup

2011-01-24 Thread memcached


Comment #4 on issue 106 by airat.ha...@gmail.com: binary protocol parsing  
can cause memcached server lockup

http://code.google.com/p/memcached/issues/detail?id=106

may be this specific issue is solved by the provided patch (haven't tried),
but in general the problem remains.
Even after applying the path suggested the following packet sent through  
udp hangs  memcached udp:

\x00\x00\x00\x00\x00\x01\x00\x00\x80\x05\x00\x01\x08\x00\x00\x00\x00\x00\x00\x0a\xef\xbe\xad\xde\x00\x00\x00\x00\x00\x00\x00\x00

(in fact any subsequent packet causes 100% usage)

Is is a storage_command (in terms of testapp.c), but the cmd (\x05,  
increment) is of type 'arithmetic_command' (in terms of testapp.c).

About this packet - all fields are 'correct' except for the type of cmd.
In the attachment - script that reproduces the livelock.

Attachments:
udp_livelock.py  931 bytes



Re: Issue 106 in memcached: binary protocol parsing can cause memcached server lockup

2010-11-10 Thread memcached

Updates:
Owner: eric.d.lambert

Comment #3 on issue 106 by eric.d.lambert: binary protocol parsing can  
cause memcached server lockup

http://code.google.com/p/memcached/issues/detail?id=106

this looks similar to the problem with issue 158, so i'll take a look at  
this




Re: Issue 106 in memcached: binary protocol parsing can cause memcached server lockup

2010-08-23 Thread memcached


Comment #1 on issue 106 by pi3orama: binary protocol parsing can cause  
memcached server lockup

http://code.google.com/p/memcached/issues/detail?id=106

The root cause of this issue is try_read_udp() never reset c-rbytes.

The processing of conn_nread minus c-rbytes by c-rlbytes, so if extra  
bytes exists, they will be 'left' in c-rbytes. However, each time when  
recvfrom()s UDP socket, the whole read buffer is overwritten, so 'c-rbytes  
+= res' sets c-rbytes incorrectly.


After the binary 'GET' command fail, the state of connection is transferred  
to 'conn_new_cmd' in write_bin_error(), then 'conn_parse_cmd' in  
reset_cmd_handler(). When c-rbytes less than sizeof(c-binary_header) (24  
bytes), the server believes that there's no enough bytes to hold binary  
header so the processing continues normally. However, in the processing of  
attached test program, c-rbytes increases by 1 each time a request is  
completed. Therefore, after 24 requests, server (wrongly) thinks there's an  
extra binary header, and it finally gets wrong magic in try_read_command().


The dead lock is caused by the error processing code. If the UDP connection  
is in conn_closing state, it never get closed but only calls  
conn_cleanup(). Therefore, when there are incoming packets in the udp  
socket, they will never be consumed, so epoll_wait() (in libevent) triggers  
input events over and over again for a closing connection.



To set c-rbytes correctly:
--- ./memcached.ori.c   2010-08-23 22:59:12.0 +0800
+++ ./memcached.c   2010-08-23 22:59:17.0 +0800
@@ -3223,7 +3223,7 @@
 res -= 8;
 memmove(c-rbuf, c-rbuf + 8, res);

-c-rbytes += res;
+c-rbytes = res;
c-rcurr = c-rbuf;
 return READ_DATA_RECEIVED;
 }


To consume incoming packets when cleanup:

--- ./memcached.ori.c   2010-08-23 22:59:12.0 +0800
+++ ./memcached.c   2010-08-23 23:27:38.0 +0800
@@ -471,6 +471,9 @@
 sasl_dispose(c-sasl_conn);
 c-sasl_conn = NULL;
 }
+
+/* consume incoming packets */
+recvfrom(c-sfd, NULL, 0, 0, NULL, NULL);
 }

 /*