Thanks a lot for your help!

I see that everyone is really against the "many small tables" setup. Is
that not efficient? I did it that way because having many tables really
felt more natural, but I can give it a shot and see if that helps.

Isart

On Wed, Jun 3, 2015 at 9:10 PM, Alex Kamil <[email protected]> wrote:

> Isart, you can try to eliminate joins by embedding smaller tables into larger
> ones wherever possible. We do this with ARRAY
> <https://phoenix.apache.org/array_type.html> (and write some UDFs
> <https://phoenix.apache.org/udf.html> for refined filtering on these
> nested tables). As James suggested, views might also be helpful.
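>
> A minimal sketch of that embedding, with hypothetical table and column
> names (not from this thread): instead of joining against a small lookup
> table, its values are stored inline as an ARRAY column on the big table
> and filtered directly:
>
> ```sql
> -- hypothetical schema: the small table's rows become an ARRAY column
> CREATE TABLE big_table (
>     id BIGINT NOT NULL PRIMARY KEY,
>     payload VARCHAR,
>     tags VARCHAR ARRAY  -- values that used to live in a separate small table
> );
>
> -- filter on the embedded array instead of joining
> SELECT id FROM big_table WHERE 'gold' = ANY(tags);
> ```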
>
> Alex
>
> On Wed, Jun 3, 2015 at 10:02 AM, James Taylor <[email protected]>
> wrote:
>
>> Rather than use a SALT_BUCKET of 2, just don't salt the table at all. It
>> never makes sense to have a SALT_BUCKET of 1, though.
>>
>> How many total tables do you have? Are you using views at all (
>> http://phoenix.apache.org/views.html)?
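>>
>> As a rough sketch (table and column names are hypothetical, not from this
>> thread), an unsalted table plus per-tenant views over it might look like:
>>
>> ```sql
>> -- no SALT_BUCKETS property: the table is simply left unsalted
>> CREATE TABLE metrics (
>>     tenant_id VARCHAR NOT NULL,
>>     ts DATE NOT NULL,
>>     val DOUBLE
>>     CONSTRAINT pk PRIMARY KEY (tenant_id, ts)
>> );
>>
>> -- one view per tenant can replace many small physical tables
>> CREATE VIEW tenant_a_metrics AS
>>     SELECT * FROM metrics WHERE tenant_id = 'tenant_a';
>> ```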
>>
>> Thanks,
>> James
>>
>> On Wednesday, June 3, 2015, Puneet Kumar Ojha <[email protected]>
>> wrote:
>>
>>>  Do not use SALT_BUCKET=32 for the smaller join tables. Use a salt number
>>> of 1 or 2.
>>>
>>> Increase the handler count to 60.  Recommended RAM is at least 16 GB per RS.
>>>
>>>
>>>
>>> Your join query performance should increase and the cluster will be stable.
>>>
>>>
>>>
>>> *From:* Isart Montane [mailto:[email protected]]
>>> *Sent:* Wednesday, June 03, 2015 4:44 PM
>>> *To:* [email protected]
>>> *Subject:* Recommendations on phoenix setup
>>>
>>>
>>>
>>> Hi,
>>>
>>> I would like to use Phoenix to replace a few of our databases, and I've
>>> been doing some tests in that direction. So far it's been working all right,
>>> but I wanted to share the setup with you to see if I can get some
>>> recommendations based on other people's experiences.
>>>
>>> Our dataset has one big table (around 200 GB) and around 100k smaller tables
>>> (the biggest is 5-6 GB, but 90% are under 1 GB). The application mainly runs
>>> joins between one or two of these small tables and the big one, returning
>>> just a few rows to the app. So far it's been working OK on a 4-node
>>> test cluster (64 GB of RAM in total).
>>>
>>> All the tables are created with SALT_BUCKETS=32, COMPRESSION='snappy'.
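>>>
>>> For reference, the DDL pattern that implies (column names are hypothetical)
>>> would be roughly:
>>>
>>> ```sql
>>> CREATE TABLE small_table_1 (
>>>     id BIGINT NOT NULL PRIMARY KEY,
>>>     val VARCHAR
>>> ) SALT_BUCKETS=32, COMPRESSION='snappy';
>>> ```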
>>>
>>>
>>>
>>> Is anyone running a similar setup? Any tips on how much RAM I should
>>> use?
>>>
>>>
>>>
>>
>
