Ah, thanks Eric, that answers my question. It sounds like using the proxy server for batch_scans and ingest is a bit beyond its scope. Are there plans for beefing up the proxy to handle a wider range of purposes from multiple clients?
Thanks,
David

On Mon, Apr 14, 2014 at 11:06 AM, Eric Newton <[email protected]> wrote:
> High ingest and batch scans use resources within the proxy for queuing
> data. If I was using a proxy for these activities, I would want to
> have a proxy for each client. Administrative requests, and even basic
> single-range scans are simple pass-throughs with a much lower chance
> of overloading the proxy.
>
> On Mon, Apr 14, 2014 at 9:56 AM, David Medinets
> <[email protected]> wrote:
>> "number of proxy servers should be proportional to the number of clients" -
>> I hate to be pedantic but this is a very general statement. Can you be
>> more specific? Should the proportion be 1:1 or 5:1? What factors affect
>> the ratio?
>>
>> On Mon, Apr 14, 2014 at 9:32 AM, Eric Newton <[email protected]> wrote:
>>>
>>> The number of proxy servers should be proportional to the number of
>>> clients.
>>>
>>> The proxy can talk to all the tablet servers, but the client of the
>>> proxy only has the proxy to make requests on its behalf.
>>>
>>> As always, it's going to depend on what you want to do, what your
>>> schema looks like, and the total number of servers you have.
>>>
>>> -Eric
>>>
>>> On Sun, Apr 13, 2014 at 11:58 PM, David O'Gwynn <[email protected]> wrote:
>>> > Hi community,
>>> >
>>> > I was reading a thread "Error stressing with pyaccumulo app" from
>>> > February, and the topic of optimal number of proxy servers for a
>>> > cluster of a given size came up. Does anyone have any insight into
>>> > that question? Is there a thread in the archive that addresses this
>>> > question directly?
>>> >
>>> > My gut tells me that you should have a number proportional to the
>>> > number of tablet servers, but I'm afraid I don't really understand
>>> > what the proxy server is doing.
