I'm not sure it is the same problem, but the last time I had a hang in the
TTransport part, it was due to a DNS misconfiguration that led to long
delays in every function that relies on the DNS resolver.
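
One quick way to rule this out is to time a resolver call on the affected
host. The following is only a minimal sketch, assuming a POSIX system;
time_resolve_ms is a hypothetical helper name, not part of Thrift, and it
only measures a single getaddrinfo() lookup.

```cpp
#include <chrono>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper (not part of Thrift): returns how long one
// getaddrinfo() lookup of `host` takes, in milliseconds, or -1.0 if the
// lookup fails.
double time_resolve_ms(const char* host) {
  addrinfo hints{};                 // zero-initialised hints
  hints.ai_family = AF_UNSPEC;      // accept IPv4 or IPv6 results
  hints.ai_socktype = SOCK_STREAM;  // TCP, as a Thrift socket would use

  addrinfo* result = nullptr;
  const auto start = std::chrono::steady_clock::now();
  const int rc = getaddrinfo(host, nullptr, &hints, &result);
  const auto stop = std::chrono::steady_clock::now();

  if (rc != 0) {
    return -1.0;  // resolution failed; gai_strerror(rc) gives the reason
  }
  freeaddrinfo(result);
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling it with the server's own hostname and with the names of its peers
should return quickly on a healthy setup. If it takes seconds, the resolver
configuration is worth checking before digging further into the transport
code.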

On Sun, 23 May 2021 at 14:12, Buster, James
<james.bus...@transunion.com.invalid> wrote:

> My server gets permanently hung after seeing this internal exception, from
> lib/cpp/src/thrift/transport/TBufferTransports.h:
>
>   void consume(uint32_t len) {
>     countConsumedMessageBytes(len);
>     if (TDB_LIKELY(static_cast<ptrdiff_t>(len) <= rBound_ - rBase_)) {
>       rBase_ += len;
>     } else {
>       throw TTransportException(TTransportException::BAD_ARGS,
>                                 "consume did not follow a borrow.");
>     }
>   }
>
> Once this happens the server becomes unresponsive and all new clients
> connect and then hang until TCP times out.
> The thread stuck in epoll_wait acts as if it's ignoring everything after
> the connection is established. It can take anywhere
> from 10 minutes to 23 hours of heavy use before this hang condition
> occurs, so it's hard to predict and there's no clear
> test case (because if I had one I presumably could make it hang
> immediately).
>
