How many network threads should we be running on a 2-broker cluster (with replication=2)? We have roughly 150-400 SimpleConsumers running, depending on the application state. We can spend some engineering time consolidating many of the consumers, but the figure I've cited is for our current test configuration. We will have around 160 processes in the production environment trying to read topics, so that is the bare minimum we could reduce the connections to, and we want to scale up from there over the next year. With our current architecture of a thread per topic, we will have hundreds of SimpleConsumers chugging away. (We have only a couple of producers, by the way, though we want to convert more of our data flow, and that will create more producers over time.)
Thanks for your help,

Bob

On Thu, Mar 21, 2013 at 9:16 PM, Jun Rao <jun...@gmail.com> wrote:

> Bob,
>
> Currently, the metadata request needs to do at least one ZK read per
> partition. So the more topics/partitions you have, the longer the request
> takes. So, you need to increase the request timeout. Try something like 60
> * 1000 ms.
>
> Thanks,
>
> Jun
>
> On Thu, Mar 21, 2013 at 12:46 PM, Bob Jervis <bjer...@gmail.com> wrote:
>
>> We are seeing horrible problems. We cannot move data through our 0.8
>> broker because we are getting socket timeout exceptions and I cannot
>> figure out what the settings should be. The fetch metadata stuff is
>> throwing these exceptions and no matter how I tweak the timeouts, I
>> still get horrible timeouts and no progress on moving data.
>>
>> On test environments where there are only 12 topics there are no problems.
>>
>> When the number of topics goes to ~75, then we can't move anything because
>> the fetch metadata requests time out.
>>
>> What can we do to fix this?????????
>>
>> I am desperate.
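
For reference, the 60 * 1000 ms timeout suggested in the quoted reply could be written down roughly as follows. This is only a sketch: the helper `render_properties` and the grouping `timeout_settings` are hypothetical, and the key names `request.timeout.ms` (producer) and `socket.timeout.ms` (high-level consumer) are assumed from the 0.8 configuration documentation, so check them against the version in use. SimpleConsumer is not configured through properties at all; it takes its socket timeout as a constructor argument.

```python
# Sketch only: renders Kafka 0.8-style client properties with a raised timeout.
# The key names are assumptions taken from the 0.8 configuration docs; verify
# them against the version you actually run.

REQUEST_TIMEOUT_MS = 60 * 1000  # the value suggested in the quoted reply


def render_properties(settings):
    """Return the settings as sorted 'key=value' lines, properties-file style."""
    return "\n".join(f"{key}={value}" for key, value in sorted(settings.items()))


# Hypothetical grouping for illustration; in practice these keys live in
# separate producer and consumer configurations.
timeout_settings = {
    "request.timeout.ms": REQUEST_TIMEOUT_MS,  # producer-side request timeout
    "socket.timeout.ms": REQUEST_TIMEOUT_MS,   # high-level consumer socket timeout
}

print(render_properties(timeout_settings))
```

The point of keeping the value in one constant is that the same timeout has to be raised everywhere a metadata request can originate, not just in one client.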