Hey Alexander,
Was just poking at the code for this: it looks like this setting really
just determines the number of mutations that get "processed together"
(as opposed to acting as a hard limit).
Since you've already done some work here, I'm curious if you could
generate some data to help back up your suggestion:
* What does your table DDL look like?
* How large is one mutation you're writing (in bytes)?
* How much data ends up being sent to a RegionServer in one RPC?
You're right that we would want to make sure that we're sending an
adequate amount of data to a RegionServer in an RPC, but this is tricky
to balance for all cases (thus, a smaller default that avoids sending
batches that are too large is the safer choice).
On 9/3/19 8:03 AM, Alexander Batyrshin wrote:
Hello all,
1) There is a bug in the documentation - http://phoenix.apache.org/tuning.html
phoenix.mutate.batchSize is not 1000, but only 100 by default:
https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/query/QueryServicesOptions.java#L164
It was changed in https://issues.apache.org/jira/browse/PHOENIX-541
2) I want to discuss this default value. From PHOENIX-541
<https://issues.apache.org/jira/browse/PHOENIX-541> I read about an
issue with MapReduce and wide rows (2 MB per row), which looks like a
rare case. But in most common cases we can get much better write
performance with batchSize = 1000, especially when it is used with a
salted table.