Heavy ingest and batch scans use resources within the proxy to queue data. If I were using a proxy for these activities, I would want one proxy per client. Administrative requests, and even basic single-range scans, are simple pass-throughs with a much lower chance of overloading the proxy.
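
To make that rule of thumb concrete, here is a rough planning sketch in plain Python. It does not touch the Accumulo proxy API at all; the `Client` class, the workload labels, and the `proxy-...` names are all made up for illustration. It also assumes that light clients (administrative requests, single-range scans) can share one proxy, which is my reading of "simple pass-through" rather than a tested recommendation.

```python
from dataclasses import dataclass

# Workloads that queue data inside the proxy (hypothetical labels).
HEAVY_WORKLOADS = {"ingest", "batch_scan"}


@dataclass
class Client:
    name: str
    workload: str  # e.g. "ingest", "batch_scan", "scan", or "admin"


def plan_proxies(clients):
    """Map a proxy label to the list of client names it would serve.

    Heavy clients each get a dedicated proxy; light clients are grouped
    onto a single shared proxy (an assumption, not a measured limit).
    """
    plan = {}
    shared = []
    for client in clients:
        if client.workload in HEAVY_WORKLOADS:
            plan[f"proxy-{client.name}"] = [client.name]
        else:
            shared.append(client.name)
    if shared:
        plan["proxy-shared"] = shared
    return plan


if __name__ == "__main__":
    clients = [
        Client("loader1", "ingest"),
        Client("loader2", "ingest"),
        Client("report", "batch_scan"),
        Client("dashboard", "scan"),
        Client("ops", "admin"),
    ]
    for proxy, served in plan_proxies(clients).items():
        print(proxy, "->", ", ".join(served))
```

With the five example clients this plans three dedicated proxies and one shared one. Whether a single shared proxy is really enough for the light clients still depends on your schema and request rates.
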

On Mon, Apr 14, 2014 at 9:56 AM, David Medinets <[email protected]> wrote:
> "number of proxy servers should be proportional to the number of clients" -
> I hate to be pedantic but
> this is a very general statement. Can you be more specific? Should the
> proportion be 1:1 or 5:1? What factors affect the ratio?
>
>
> On Mon, Apr 14, 2014 at 9:32 AM, Eric Newton <[email protected]> wrote:
>>
>> The number of proxy servers should be proportional to the number of
>> clients.
>>
>> The proxy can talk to all the tablet servers, but the client of the
>> proxy only has the proxy to make requests on its behalf.
>>
>> As always, it's going to depend on what you want to do, what your
>> schema looks like, and the total number of servers you have.
>>
>> -Eric
>>
>> On Sun, Apr 13, 2014 at 11:58 PM, David O'Gwynn <[email protected]> wrote:
>> > Hi community,
>> >
>> > I was reading a thread "Error stressing with pyaccumulo app" from
>> > February, and the topic of optimal number of proxy servers for a
>> > cluster of a given size came up. Does anyone have any insight into
>> > that question? Is there a thread in the archive that addresses this
>> > question directly?
>> >
>> > My gut tells me that you should have a number proportional to the
>> > number of tablet servers, but I'm afraid I don't really understand
>> > what the proxy server is doing.
>
>
