Thanks, Phil.

1- In my use case, it's probably okay to partition all the org data together. This is for a B2B enterprise SaaS application; the customers will be organizations.
So it is probably okay to store each org's data next to each other, right?

2- I'm thinking of having the primary key be: (org_id, team_id, project_id, issue_id). In the above case, will there be a skinny row per issue, or a wide row per org / team / project?

3- Just to double check, with the above primary key, can I still query using just the org_id, org + team id, and org + team + project id?

4- If I wanted to refer to a particular issue, it looks like I'd need to send all 4 parameters. That may be problematic. Is there a better way of modeling this data?

On Thu, Oct 6, 2016 at 9:30 PM, Philip Persad <[email protected]> wrote:

> 1) No. Your first 3 queries will work but not the last one (get issue by
> id). In Cassandra when you query you must include every preceding portion
> of the primary key.
>
> 2) 64 bytes (16 * 4), or somewhat more if storing as strings? I don't
> think that's something I'd worry too much about.
>
> 3) Depends on how you build your partition key. If partition key is (org
> id), then you get one partition per org (probably bad depending on your
> dataset). If partition key is (org id, team id, project id) then you will
> have one partition per project which is probably fine (again, depending on
> your dataset).
>
> Cheers,
>
> -Phil
> ------------------------------
> From: Ali Akhtar <[email protected]>
> Sent: 2016-10-06 9:04 AM
> To: [email protected]
> Subject: Partition Key - Wide rows?
>
> Heya,
>
> I'm designing some tables, where data needs to be stored in the following
> hierarchy:
>
> Organization -> Team -> Project -> Issues
>
> I need to be able to retrieve issues:
>
> - For the whole org - using org id
> - For a team (org id + team id)
> - For a project (org id + team id + project id)
> - If possible, by using just the issue id
>
> I'm considering using all 4 ids as the primary key. The first 3 will use
> UUIDs, except issue id which will be an alphanumeric string, unique per
> project.
>
> 1) Will this setup allow using all 4 query scenarios?
> 2) Will this make the primary key really long, 3 UUIDs + similar length'd
> issue id?
> 3) Will this store issues as skinny rows, or wide rows? If an org has a
> lot of teams, which have a lot of projects, which have a lot of issues,
> etc, could I have issues w/ running out of the column limit of wide rows?
> 4) Is there a better way of achieving this scenario?
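
For reference, the rule discussed in this thread (a query must include every preceding portion of the primary key) can be sketched in plain Python. This is only a simplified model, not Cassandra code: the function name is_queryable and the two layouts are made up for illustration, only equality restrictions are modeled, and ALLOW FILTERING and secondary indexes are ignored. It also reflects that Cassandra expects every partition key column to be restricted, so the two partition key choices mentioned above support different sets of queries.

```python
def is_queryable(partition_key, clustering_cols, restricted):
    """Simplified check: can we query by equality on `restricted` columns
    without ALLOW FILTERING or a secondary index? (Hypothetical helper.)"""
    restricted = set(restricted)

    # Every partition key column must be restricted.
    if not all(col in restricted for col in partition_key):
        return False

    # Whatever is left must form a prefix of the clustering columns.
    remaining = restricted - set(partition_key)
    for col in clustering_cols:
        if col in remaining:
            remaining.remove(col)
        else:
            break
    return not remaining


# Layout A (assumed): PRIMARY KEY ((org_id), team_id, project_id, issue_id)
LAYOUT_A = (("org_id",), ("team_id", "project_id", "issue_id"))

# Layout B (assumed): PRIMARY KEY ((org_id, team_id, project_id), issue_id)
LAYOUT_B = (("org_id", "team_id", "project_id"), ("issue_id",))

QUERIES = [
    {"org_id"},
    {"org_id", "team_id"},
    {"org_id", "team_id", "project_id"},
    {"issue_id"},
]

if __name__ == "__main__":
    for name, (pk, ck) in (("A", LAYOUT_A), ("B", LAYOUT_B)):
        for q in QUERIES:
            print(name, sorted(q), is_queryable(pk, ck, q))
```

Under this model, layout A accepts the org, org + team, and org + team + project lookups but puts a whole org in one partition, while layout B gives one partition per project but only accepts lookups that supply all three of org, team, and project id. Neither accepts a lookup by issue id alone.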
