I have a total of 10 client nodes, with 3-10 threads running on each node. Record size is ~1 KB.
Viv

On Thu, Mar 24, 2011 at 8:28 PM, Ted Dunning <[email protected]> wrote:

> Are you putting this data from a single host? Is your sender
> multi-threaded?
>
> I note that (20 GB / 20 minutes < 20 MB / s) so you aren't particularly
> stressing the network. You would likely be stressing a single threaded
> client pretty severely.
>
> What is your record size? It may be that you are bound up by the number of
> records being inserted rather than the total data size.
>
> On Thu, Mar 24, 2011 at 5:22 PM, Vivek Krishna <[email protected]> wrote:
>
>> Data Size - 20 GB. It took about an hour with default hbase setting and
>> after varying several parameters, we were able to get this done in ~20
>> minutes. This is slow and we are trying to improve.
>>
>> We wrote a java client which would essentially `put` to hbase tables in
>> batches. Our fine-tuning parameters include,
>> 1. Disabling compaction
>> 2. Varying batch sizes of put (tried with 1000, 5000, 10000, 20000, 40000)
>> 3. Setting AutoFlush to on/off.
>> 4. Varying write buffer (in client) with 2mb, 128mb, 256mb
>> 5. Changing regionserver.handler.count to 100
>> 6. Varying regionserver size from 128 to 256/512/1024.
>> 7. Increasing number of regions.
>> 8. Creating regions with keys pre-specified (so that clients hit the
>> regions directly)
>> 9. Varying number of clients (from 30 clients to 100 clients)
>>
>> The above was tested on a 38 node cluster with 2 regions each.
>>
>> We did not try disabling WAL fearing loss of data.
>>
>> Are there any other parameters that we missed during the process?
>>
>> Viv
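For reference, the back-of-the-envelope numbers from this thread can be worked out with a small sketch. It only redoes the arithmetic (20 GB in 20 minutes, ~1 KB records, 10 client nodes with 3-10 threads each); it does not touch HBase. The class and method names are made up for illustration, and it assumes 1 GB = 1024 MB and that the load is spread evenly across threads, which may not hold in practice.

```java
public class ThroughputEstimate {
    // Average throughput in MB/s, assuming 1 GB = 1024 MB.
    public static double mbPerSecond(double gigabytes, double minutes) {
        return gigabytes * 1024.0 / (minutes * 60.0);
    }

    // Records per second implied by a throughput, for a record size given in KB.
    public static double recordsPerSecond(double mbPerSec, double recordKb) {
        return mbPerSec * 1024.0 / recordKb;
    }

    public static void main(String[] args) {
        double mbps = mbPerSecond(20, 20);        // 20 GB loaded in 20 minutes
        double rps = recordsPerSecond(mbps, 1.0); // ~1 KB per record
        System.out.printf("cluster: %.1f MB/s, %.0f records/s%n", mbps, rps);

        // 10 client nodes with 3 to 10 threads each gives 30 to 100 writer threads.
        int nodes = 10;
        int[] threadsPerNode = {3, 10};
        for (int t : threadsPerNode) {
            int threads = nodes * t;
            // Assumes an even split of the load across all threads.
            System.out.printf("%d threads: %.0f records/s per thread%n",
                    threads, rps / threads);
        }
    }
}
```

This comes out to roughly 17 MB/s in aggregate, i.e. on the order of 17,000 records per second across the whole cluster, which is consistent with the point that the record count rather than the raw data volume may be the limiting factor.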
