Re: Socket timeouts in 0.8

Bob Jervis Fri, 22 Mar 2013 09:44:15 -0700

How many network threads should we be running with a 2-broker
cluster (and replication=2)?  We have roughly 150-400 SimpleConsumers
running, depending on the application state.  We can spend some engineering
time consolidating many of the consumers, but the figure I've cited is for
our current test configuration.  We will have around 160 processes in the
production environment trying to read topics, so that is the bare minimum
we could reduce the connections to.  And we want to scale up from there
over the next year.  With our current architecture of a thread per topic,
we will have hundreds of SimpleConsumers chugging away (only a couple of
producers, by the way, though we want to convert more of our data flow and
that will create more producers over time).


Thanks for your help,
Bob


On Thu, Mar 21, 2013 at 9:16 PM, Jun Rao <jun...@gmail.com> wrote:

> Bob,
>
> Currently, the metadata request needs to do at least one ZK read per
> partition. So the more topics/partitions you have, the longer the request
> takes. So, you need to increase the request timeout. Try something like 60
> * 1000 ms.
>
> Thanks,
>
> Jun
>
> On Thu, Mar 21, 2013 at 12:46 PM, Bob Jervis <bjer...@gmail.com> wrote:
>
>> We are seeing horrible problems.  We cannot move data through our 0.8
>> broker because we are getting socket timeout exceptions and I cannot
>> figure out what the settings should be.  The fetch metadata stuff is
>> throwing these exceptions and no matter how I tweak the timeouts, I still
>> get horrible timeouts and no progress on moving data.
>>
>> On test environments where there are only 12 topics there are no problems.
>>
>> When the number of topics goes to ~75, then we can't move anything because
>> the fetch metadata requests time out.
>>
>> What can we do to fix this?????????
>>
>> I am desperate.
>>
>
>
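
As a rough illustration of Jun's point that metadata request time grows
with the partition count, here is a back-of-the-envelope sketch in Java.
Everything in it is an assumption made for illustration: the class and
method names are hypothetical, and the 4 partitions per topic, 20 ms per
ZK read, and 3x safety factor are made-up figures, not measurements from
this thread. It is only meant to show why 12 topics can fit inside a
short timeout while ~75 topics may not.

```java
// Hypothetical helper, not part of Kafka: estimates a socket/request
// timeout from the number of partitions a metadata request must cover.
final class MetadataTimeoutEstimate {

    // Assumes one sequential ZK read per partition (per Jun's description)
    // and scales the total by a safety factor to leave headroom.
    static long estimateTimeoutMs(int partitions, long zkReadMillis, double safetyFactor) {
        if (partitions < 0 || zkReadMillis < 0 || safetyFactor < 1.0) {
            throw new IllegalArgumentException("invalid estimate inputs");
        }
        return (long) Math.ceil(partitions * zkReadMillis * safetyFactor);
    }

    public static void main(String[] args) {
        int partitionsPerTopic = 4;   // assumed, not stated in the thread
        long zkReadMillis = 20;       // assumed average ZK read latency
        double safetyFactor = 3.0;    // assumed headroom multiplier

        long small = estimateTimeoutMs(12 * partitionsPerTopic, zkReadMillis, safetyFactor);
        long large = estimateTimeoutMs(75 * partitionsPerTopic, zkReadMillis, safetyFactor);

        System.out.println("12 topics: " + small + " ms");
        System.out.println("75 topics: " + large + " ms");
        // The resulting value would be supplied as the consumer's socket
        // timeout; Jun's suggested 60 * 1000 ms is a simpler fixed choice.
    }
}
```

With real latency numbers from your own ZooKeeper ensemble plugged in, the
same arithmetic gives a sanity check on whether a fixed value such as
60 * 1000 ms leaves enough headroom as topics are added.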
