Hey Alexander,
Was just poking at the code for this: it looks like this setting really
just determines the number of mutations that get "processed together"
(as opposed to acting as a hard limit).
Since you've already done some work here, I'm curious if you could
generate some data to help back up your suggestion:
* What does your table DDL look like?
* How large is one mutation you're writing (in bytes)?
* How much data ends up being sent to a RegionServer in one RPC?
You're right that we would want to make sure that we're sending an
adequate amount of data to a RegionServer in an RPC, but this is tricky
to balance for all cases (thus, a smaller default that avoids sending
batches that are too large is the safer choice).
On 9/3/19 8:03 AM, Alexander Batyrshin wrote:
Hello all,
1) There is a bug in the documentation - http://phoenix.apache.org/tuning.html
phoenix.mutate.batchSize is not 1000, but only 100 by default:
https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/query/QueryServicesOptions.java#L164
It was changed in https://issues.apache.org/jira/browse/PHOENIX-541
2) I want to discuss this default value. From PHOENIX-541
<https://issues.apache.org/jira/browse/PHOENIX-541> I read about an
issue with MapReduce and wide rows (2 MB per row), which looks like a
rare case. But in most common cases we can get much better write
performance with batchSize = 1000, especially when it is used with a
salted table.