I'm consuming the Streaming API using the filter method (tracking some user ids). I've noticed that I'm getting an extra, undocumented line before each length delimiter.
I connect and get the following coming down the pipe:

{{{
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.17)

5DE
1496
{"coordinates":null, ... snip ..., "id":10487365330}
A52
2636
{"coordinates":null, ...snip ..., "id":10487377907}
592
1420
{"coordinates":null, ... snip ..., "id":10487298462}
}}}

Now, the Streaming API docs say, "Statuses are represented by a length, in bytes, a newline, and the status text that is exactly length bytes. Note that "keep-alive" newlines may be inserted before each length."

This suggests the following read loop (based on, and equivalent to, the way tweepy's consumer is implemented):

{{{
# s is the connected socket; read one byte at a time up to the newline
length = ''
while True:
    c = s.recv(1)
    if c == '\n':
        break
    length += c
length = length.strip()
# anything that is all digits is treated as the length delimiter
if length.isdigit():
    length = int(length)
    status_data = s.recv(length)
    # do something with the data
}}}

However, if you look at the third status above, you can see that the extra line can sometimes consist entirely of digits, in that case ``592``. The loop then treats ``592`` as the length, which fairly effectively borks the consumer.

Now, I can hack that read loop in quite a few ways to accommodate this extra data coming down the pipe. The question is, what's the best way to do so? Is this something I can rely on, e.g. can I always expect a line above the length delimiter? Will it always have three characters? Are statuses always longer than 1000 bytes?

Plus, I'm wondering whether this has always been the case, or whether there are broken consumers out there missing tweets.

Thanks,
James.
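
One reading that seems to fit the sample above, offered as a guess rather than anything the Streaming API docs confirm: the response headers include `Transfer-Encoding: chunked`, and in HTTP/1.1 chunked encoding each chunk is preceded by its size in hexadecimal. The numbers line up: 0x5DE = 1502 = 1496 + 6, where the 6 extra bytes would be the `1496\r\n` length line itself, and A52 and 592 work out the same way against 2636 and 1420. If that is right, the extra line is transport framing, not part of the API payload. Below is a minimal Python 3 sketch of undoing the chunking before parsing the length-delimited statuses. The names `iter_chunks`, `iter_statuses` and `encode_chunk` are hypothetical helpers made up for this illustration, and the sample data is invented.

```python
import io


def iter_chunks(stream):
    """Yield the payload of each HTTP/1.1 chunk from a binary file-like object."""
    while True:
        size_line = stream.readline()
        if not size_line:
            return  # connection closed
        # The chunk size is hexadecimal; anything after ';' is a chunk extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            return  # a zero-length chunk marks the end of the body
        data = stream.read(size)
        stream.read(2)  # discard the CRLF that terminates every chunk
        yield data


def iter_statuses(body):
    """Yield each length-delimited status from an already de-chunked body."""
    while True:
        line = body.readline()
        if not line:
            return
        line = line.strip()
        if not line:
            continue  # "keep-alive" newline before a length
        yield body.read(int(line))


def encode_chunk(payload):
    """Frame payload as one chunk (hex size, CRLF, data, CRLF); used for the demo only."""
    return format(len(payload), "X").encode("ascii") + b"\r\n" + payload + b"\r\n"


# Demo with invented data: each chunk carries one "length\r\nstatus" record.
statuses = [b'{"id":1}\r\n', b'{"id":22}\r\n']
wire = b"".join(
    encode_chunk(str(len(s)).encode("ascii") + b"\r\n" + s) for s in statuses
) + b"0\r\n\r\n"

body = io.BytesIO(b"".join(iter_chunks(io.BytesIO(wire))))
assert list(iter_statuses(body)) == statuses
```

The demo buffers the whole body only to keep the example short; a real consumer would decode incrementally. It may be simpler still to let an HTTP client library that already handles chunked responses (Python's standard `http.client` does) hand over the de-chunked body, so the read loop only ever sees the length lines and the statuses.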