Thank you, this was helpful. What about the number of splits for a table? Is there a general rule of thumb for how many splits there should be, and what size they should be, when trying to balance ingest and query performance?

On Fri, Jul 15, 2016 at 2:38 PM, Emilio Lahr-Vivaz <[email protected]> wrote:

> Another thing to consider is how many tablet servers the mutations are
> being sent to - if they're all going to a single split, that's going to
> reduce your throughput a lot.
>
>
> On 07/15/2016 02:33 PM, [email protected] wrote:
>
> The batch writer has several knobs (latency time, memory buffer, etc) that
> you can tune to meet your requirements. The values for those settings will
> depend on a lot of variables, to include:
>
> - number of tablet servers
> - size of mutations
> - desired latency
> - memory buffer
> - configuration settings on the table(s) and tablet servers.
>
> Suggest picking a starting point and see how it works for you, such as
>
> threads - equal to the number of tablet servers (unless you have a
> really large number of tablet servers)
> buffer - 100MB
> latency - 10 seconds
>
> If you are hitting a wall with those settings, you could increase the
> buffer and latency and/or change some settings on the server side that have
> to do with the write ahead logs.
>
> ------------------------------
> *From: *"Jamie Johnson" <[email protected]> <[email protected]>
> *To: *[email protected]
> *Sent: *Friday, July 15, 2016 2:16:40 PM
> *Subject: *Configuring batch writers
>
> Is there any documentation that outlines reasonable settings for batch
> writers given a known ingest rate? For instance if I have a source that is
> producing in the neighborhood of 15MB of mutations per second, what would a
> reasonable configuration for the batch writer be to handle an ingest at
> this rate? What are reasonable rules of thumb to follow to ensure that the
> writers don't block, etc?
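
For anyone who wants to sanity-check the numbers quoted above (15MB/s of mutations, a 100MB buffer, 10 seconds of latency), here is a rough back-of-the-envelope sketch in Python. It assumes a deliberately simplified model in which the writer flushes either when the buffer fills or when the latency timer expires, whichever comes first; the real batch writer may behave differently. The function names and the count of five tablet servers are made up for illustration and do not come from any library.

```python
# Simplified model (an assumption, not the real client's exact behavior):
# the writer flushes when its memory buffer fills or when the max latency
# elapses, whichever happens first.


def seconds_to_fill(buffer_mb, ingest_mb_per_s):
    """Seconds for a steady ingest stream to fill the memory buffer."""
    if ingest_mb_per_s <= 0:
        raise ValueError("ingest rate must be positive")
    return buffer_mb / ingest_mb_per_s


def flush_trigger(buffer_mb, latency_s, ingest_mb_per_s):
    """Return which limit is reached first: 'buffer' or 'latency'."""
    fill_s = seconds_to_fill(buffer_mb, ingest_mb_per_s)
    return "buffer" if fill_s < latency_s else "latency"


def per_server_rate(ingest_mb_per_s, servers_receiving):
    """Write load per tablet server if mutations spread evenly over them."""
    if servers_receiving < 1:
        raise ValueError("at least one tablet server must receive writes")
    return ingest_mb_per_s / servers_receiving


if __name__ == "__main__":
    # Figures from the thread: 15MB/s source, 100MB buffer, 10 second latency.
    print("seconds to fill buffer:", round(seconds_to_fill(100, 15), 1))
    print("first limit reached:", flush_trigger(100, 10, 15))
    # Hypothetical cluster: one split (one server) versus five servers.
    print("MB/s per server, 1 server:", per_server_rate(15, 1))
    print("MB/s per server, 5 servers:", per_server_rate(15, 5))
```

Under this model the 100MB buffer fills before the 10 second latency expires at 15MB/s, so the buffer size rather than the latency would drive flushes. The last two lines illustrate the point about splits: with a single split, one tablet server absorbs the full ingest rate.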
