James,

Thanks for the response.  

There could be a dozen or so users accessing the system, often hitting the same
portions of the tables.  Our motive for salting has been to eliminate hot
spotting: our data is time-series based, and so is the PK.
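
To make the hot-spotting concern concrete, here is a minimal Python sketch of
how a salt prefix spreads consecutive time-series keys across buckets. The
names salt_bucket, salted_key and write_distribution are hypothetical, and
CRC32 is only a stand-in hash; this is not Phoenix's actual implementation,
just the general idea of a one-byte prefix derived from the row key.

```python
import zlib
from collections import Counter


def salt_bucket(row_key: bytes, buckets: int) -> int:
    """Pick a bucket from a hash of the row key (CRC32 is a stand-in hash)."""
    return zlib.crc32(row_key) % buckets


def salted_key(row_key: bytes, buckets: int) -> bytes:
    """Prepend the one-byte salt so adjacent keys no longer sort together."""
    return bytes([salt_bucket(row_key, buckets)]) + row_key


def write_distribution(keys, buckets: int) -> Counter:
    """Count how many writes land in each bucket."""
    return Counter(salt_bucket(k, buckets) for k in keys)


# 1000 consecutive one-second timestamps, as a monotonically increasing PK.
keys = [str(1433779200 + i).encode() for i in range(1000)]
dist = write_distribution(keys, 8)
print(sorted(dist.items()))
```

Without the prefix, all 1000 writes would go to whichever region holds the
newest timestamps; with it, they are divided among the buckets.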

Thanks,
Ralph




On 6/8/15, 10:00 AM, "James Taylor" <[email protected]> wrote:

>Hi Ralph,
>What kind of workload do you expect on your cluster? Will there be
>many users accessing many different parts of your table(s)
>simultaneously? Have you considered not salting your tables? Or do you
>have hot spotting issues at write time due to the layout of your PK
>that salting is preventing? With the advent of table stats
>(http://phoenix.apache.org/update_statistics.html), Phoenix is able to
>parallelize queries along equal chunks of data, similar to what
>occurs with salting.
>
>The downside of salting is for queries that are only accessing a
>handful of rows. Because Phoenix doesn't know which salt bucket
>contains which of these rows, a scan always needs to be run for
>every salt bucket. If you have 100 salt buckets, this is 100 scans
>(worst case loading 100 blocks) versus a single scan for the unsalted
>case (loading a single block). This will impact the throughput you
>see.
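
The scan fan-out described above can be sketched roughly as follows. This is
an illustrative Python model, not Phoenix code: scan_ranges is a hypothetical
helper, and it assumes the salt is a single leading byte so that one logical
key range becomes one physical range per bucket.

```python
def scan_ranges(start: bytes, stop: bytes, salt_buckets: int):
    """Return the physical key ranges needed to cover one logical range.

    salt_buckets == 0 models an unsalted table.
    """
    if salt_buckets == 0:
        # Unsalted: the rows are contiguous, so a single scan covers them.
        return [(start, stop)]
    # Salted: matching rows may sit behind any salt prefix,
    # so the same range has to be scanned once per bucket.
    return [(bytes([b]) + start, bytes([b]) + stop)
            for b in range(salt_buckets)]


unsalted = scan_ranges(b"2015-06-08T10:00", b"2015-06-08T10:01", 0)
salted = scan_ranges(b"2015-06-08T10:00", b"2015-06-08T10:01", 100)
print(len(unsalted), len(salted))
```
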
>
>I'd encourage you to use Pherf (http://phoenix.apache.org/pherf.html)
>to test salting (over multiple salt bucket sizes) versus unsalted for
>realistic scenarios to get an accurate assessment for your workload.
>
>Thanks,
>James
>
>On Mon, Jun 8, 2015 at 9:34 AM, Perko, Ralph J <[email protected]> wrote:
>> Hi – following up on this.
>>
>> Is it generally recommended to roughly match the salt bucket count to region
>> server count?  Or is it more arbitrary?  Should I use something like 255
>> because the regions are going to split anyway?
>>
>> Thanks,
>> Ralph
>>
>>
>> From: "Perko, Ralph J"
>> Reply-To: "[email protected]"
>> Date: Friday, June 5, 2015 at 11:39 AM
>> To: "[email protected]"
>> Subject: Salt bucket count recommendation
>>
>> Hi,
>>
>> We have a 40 node cluster with 8 core tables and around 35 secondary index
>> tables.  The tables get very large – billions of records and terabytes of
>> data.  What salt bucket count do you recommend?
>>
>> Thanks,
>> Ralph
>>
