We (The Last Pickle) wrote a blog post on scaling time series:
http://thelastpickle.com/blog/2017/08/02/time-series-data-modeling-massive-scale.html

Rather than an agent_type, you can use an application-determined bucket, so
that agents with more data use more buckets.  That'll keep your partition
sizes under control.  The blog post goes into a bit of detail, so I won't
rehash it all here.  It's a pretty standard solution to this problem.
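
To make the bucketing idea concrete, here is a minimal Python sketch, not a
definitive implementation.  It assumes the partition key becomes
(agent_id, bucket) and that the application tracks a bucket count per agent.
The bucket column, the bucket_for and buckets_to_read helpers, and the modulo
scheme are all assumptions for illustration, not something taken from the
blog post.

```python
from typing import List

# Assumed schema change (hypothetical "bucket" column added to the
# partition key of the table from the original question):
#
#   CREATE TABLE IF NOT EXISTS XXX (
#     agent_id UUID,
#     bucket INT,
#     row_id BIGINT,
#     path TEXT,
#     security_attributes TEXT,
#     actor TEXT,
#     PRIMARY KEY ((agent_id, bucket), row_id)
#   )


def bucket_for(row_id: int, bucket_count: int = 1) -> int:
    """Pick the bucket a row is written to.

    bucket_count is chosen by the application per agent: quiet agents
    can stay at 1, busy agents get more buckets.
    """
    if bucket_count < 1:
        raise ValueError("bucket_count must be at least 1")
    # Python's % returns a non-negative result for a positive modulus,
    # so negative BIGINT row ids still map into [0, bucket_count).
    return row_id % bucket_count


def buckets_to_read(bucket_count: int) -> List[int]:
    """List every bucket that must be queried (or deleted) for one agent."""
    if bucket_count < 1:
        raise ValueError("bucket_count must be at least 1")
    return list(range(bucket_count))
```

With this layout, an agent with 10M rows and 100 buckets ends up with roughly
100,000 rows per partition if row ids are spread evenly.  The trade-off is
that reading or deleting everything for an agent means one statement per
bucket, and changing an agent's bucket count later changes where existing
row ids map, so the count needs to be chosen (or versioned) with care.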

On Tue, Feb 5, 2019 at 11:38 AM Bobbie Haynes <haynes30...@gmail.com> wrote:

> Even if I create an agent_type, it will be the same issue again because
> agent_id and agent_type would have the same values...
>
> On Tue, Feb 5, 2019 at 11:36 AM Bobbie Haynes <haynes30...@gmail.com>
> wrote:
>
>> Unfortunately I do not have different types of agents (agent_type). I only
>> have agent_id, which is also of type UUID.
>>
>> On Tue, Feb 5, 2019 at 11:34 AM Nitan Kainth <nitankai...@gmail.com>
>> wrote:
>>
>>> You could consider a pseudo column like agent_type and make it part of a
>>> compound partition key. It will break your partition into smaller ones, but
>>> you will have to query with both agent_id and agent_type in that case.
>>>
>>> On Tue, Feb 5, 2019 at 12:59 PM Bobbie Haynes <haynes30...@gmail.com>
>>> wrote:
>>>
>>>> Hi Everyone,
>>>> Could you please help me model the table below? I'm stuck here. My
>>>> partition key is agent_id and the clustering column is row_id. Each agent
>>>> can have anywhere from 1,000 to 10M rows, depending on how busy the agent
>>>> is. I'm facing a large-partition issue for my busy agents. I'm using
>>>> SizeTieredCompactionStrategy here. The table has a write/read ratio of
>>>> 70/30, and there are also deletes by agent_id.
>>>>
>>>>
>>>> CREATE TABLE IF NOT EXISTS XXX (
>>>>  agent_id UUID,
>>>>  row_id BIGINT,
>>>>  path TEXT,
>>>>  security_attributes TEXT,
>>>>  actor TEXT,
>>>>  PRIMARY KEY (agent_id,row_id)
>>>> )
>>>>
>>>

-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade
