There is a German saying: Sometimes you can't see the forest for all the trees.
On 05.03.2017 09:25, "DuyHai Doan" <doanduy...@gmail.com> wrote:

> No problem, distributed systems are hard to reason about, I got caught many
> times in the past
>
> On Sun, Mar 5, 2017 at 9:23 AM, benjamin roth <brs...@gmail.com> wrote:
>
> > Sorry. Answer was too fast. Maybe you are right.
> >
> > On 05.03.2017 09:21, "benjamin roth" <brs...@gmail.com> wrote:
> >
> > > No. You just change the partitioner. That's all
> > >
> > > On 05.03.2017 09:15, "DuyHai Doan" <doanduy...@gmail.com> wrote:
> > >
> > >> "How can that be achieved? I haven't done "scientific researches" yet
> > >> but I guess a "MV partitioner" could do the trick. Instead of applying
> > >> the regular partitioner, an MV partitioner would calculate the PK of
> > >> the base table (which is always possible) and then apply the regular
> > >> partitioner."
> > >>
> > >> The main purpose of MV is to avoid the drawbacks of the 2nd index
> > >> architecture, e.g. having to scan a lot of nodes to fetch the results.
> > >>
> > >> With MV, since you give the partition key, the guarantee is that you'll
> > >> hit a single node.
> > >>
> > >> Now if you put MV data on the same node as base table data, you're
> > >> doing more-or-less the same thing as 2nd index.
> > >>
> > >> Let's take a dead simple example
> > >>
> > >> CREATE TABLE user (user_id uuid PRIMARY KEY, email text);
> > >> CREATE MATERIALIZED VIEW user_by_email AS SELECT * FROM user
> > >>   WHERE user_id IS NOT NULL AND email IS NOT NULL
> > >>   PRIMARY KEY ((email), user_id);
> > >>
> > >> SELECT * FROM user_by_email WHERE email = 'xxx';
> > >>
> > >> With this query, how can you find the user_id that corresponds to email
> > >> 'xxx' so that your MV partitioner idea can work?
> > >>
> > >> On Sun, Mar 5, 2017 at 9:05 AM, benjamin roth <brs...@gmail.com> wrote:
> > >>
> > >> > While I was reading the MV paragraph in your post, an idea popped up:
> > >> >
> > >> > The problem with MV inconsistencies and inconsistent range movement is
> > >> > that the "MV contract" is broken. This only happens because base data
> > >> > and replica data reside on different hosts. If base data + replicas
> > >> > stayed on the same host, then a rebuild/remove would always stream
> > >> > both matching parts of a base table + MV.
> > >> >
> > >> > So my idea:
> > >> > Why not make a replica ALWAYS stay local, regardless of where the
> > >> > token of an MV would point? That would solve these problems:
> > >> > 1. Rebuild / remove node would not break the MV contract
> > >> > 2. A write always stays local:
> > >> >
> > >> > a) That means replication happens synchronously. That means a quorum
> > >> > write to the base table guarantees instant data availability with a
> > >> > quorum read on a view
> > >> >
> > >> > b) It saves network roundtrips + request/response handling and helps
> > >> > to keep a cluster healthier in case of bulk operations (like repair
> > >> > streams or rebuild streams). Write load stays local and is not spread
> > >> > across the whole cluster. I think it makes the load in these
> > >> > situations more predictable.
> > >> >
> > >> > How can that be achieved? I haven't done "scientific researches" yet
> > >> > but I guess a "MV partitioner" could do the trick. Instead of applying
> > >> > the regular partitioner, an MV partitioner would calculate the PK of
> > >> > the base table (which is always possible) and then apply the regular
> > >> > partitioner.
> > >> >
> > >> > I'll create a proper Jira for it on Monday. Currently it's Sunday here
> > >> > and my family wants me back, so just a few thoughts on this right now.
> > >> >
> > >> > Any feedback is appreciated!
> > >> >
> > >> > 2017-03-05 6:34 GMT+01:00 Edward Capriolo <edlinuxg...@gmail.com>:
> > >> >
> > >> > > On Sat, Mar 4, 2017 at 10:26 AM, Jeff Jirsa <jji...@gmail.com> wrote:
> > >> > >
> > >> > > > > On Mar 4, 2017, at 7:06 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> > >> > > > >
> > >> > > > >> On Fri, Mar 3, 2017 at 12:04 PM, Jeff Jirsa <jji...@gmail.com> wrote:
> > >> > > > >>
> > >> > > > >> On Fri, Mar 3, 2017 at 5:40 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> > >> > > > >>
> > >> > > > >>> I used them. I built do-it-yourself secondary indexes with
> > >> > > > >>> them. They have their gotchas, but so do all the secondary
> > >> > > > >>> index implementations. Just because DataStax does not write
> > >> > > > >>> about something. Let's see, like 5 years ago there was this:
> > >> > > > >>> https://github.com/hmsonline/cassandra-triggers
> > >> > > > >>>
> > >> > > > >> Still in use? How'd it work? Production ready? Would you still
> > >> > > > >> do it that way in 2017?
> > >> > > > >>
> > >> > > > >>> There is a fairly large divergence between what actual users do
> > >> > > > >>> and what other groups 'say' actual users do in some cases.
> > >> > > > >>>
> > >> > > > >> A lot of people don't share what they're doing (for business
> > >> > > > >> reasons, or because they don't think it's important, or because
> > >> > > > >> they don't know how/where), and that's fine but it makes it hard
> > >> > > > >> for anyone to know what features are used, or how well they're
> > >> > > > >> really working in production.
> > >> > > > >>
> > >> > > > >> I've seen a handful of "how do we use triggers" questions in
> > >> > > > >> IRC, and they weren't unreasonable questions, but seemed like a
> > >> > > > >> lot of pain, and more than one of those people ultimately came
> > >> > > > >> back and said they used some other mechanism (and of course,
> > >> > > > >> some of them silently disappear, so we have no idea if it worked
> > >> > > > >> or not).
> > >> > > > >>
> > >> > > > >> If anyone's actively using triggers, please don't keep it a
> > >> > > > >> secret. Knowing that they're being used would be a great way to
> > >> > > > >> justify continuing to maintain them.
> > >> > > > >>
> > >> > > > >> - Jeff
> > >> > > > >>
> > >> > > > >
> > >> > > > > "Still in use? How'd it work? Production ready? Would you still
> > >> > > > > do it that way in 2017?"
> > >> > > > >
> > >> > > > > I mean that is a loaded question. How long has Cassandra had
> > >> > > > > Secondary Indexes? Did they work well? Would you use them? How
> > >> > > > > many times were they re-written?
> > >> > > >
> > >> > > > It wasn't really meant to be a loaded question; I was being sincere
> > >> > > >
> > >> > > > But I'll answer: secondary indexes suck for many use cases, but
> > >> > > > they're invaluable for their actual intended purpose, and I have no
> > >> > > > idea how many times they've been rewritten but they're production
> > >> > > > ready for their narrow use case (defined by cardinality).
> > >> > > >
> > >> > > > Is there a real triggers use case still? Alternative to MVs?
> > >> > > > Alternative to CDC? I've never implemented triggers - since you
> > >> > > > have, what's the level of surprise for the developer?
> > >> > >
> > >> > > :) You mention alternatives: Let's break them down.
> > >> > >
> > >> > > MV:
> > >> > > They seem to have a lot of promise. I.e. you can use them for things
> > >> > > other than equality searches, and I do think the CQL example with the
> > >> > > top N high scores is pretty useful. Then again, our buddy Mr Roth has
> > >> > > a thread named "Rebuild / remove node with MV is inconsistent". I
> > >> > > actually think a lot of the use case for MV falls into the category
> > >> > > of "something you should actually be doing with Storm". I can vibe
> > >> > > with the concept of not needing a streaming platform, but I KNOW
> > >> > > Storm would do this correctly. I don't want to land on something like
> > >> > > 2x index v1 v2 where there were fundamental flaws at scale (not
> > >> > > saying this is the case, but the rebuild thing seems a bit scary).
> > >> > >
> > >> > > CDC:
> > >> > > I'm slightly afraid of this. Rationale: an extensible piece designed
> > >> > > specifically for a closed-source implementation of hub and spoke
> > >> > > replication. I have some experience trying to "play along" with
> > >> > > extensible things:
> > >> > > https://issues.apache.org/jira/browse/CASSANDRA-12627
> > >> > > "Thus, I'm -1 on {{PropertyOrEnvironmentSeedProvider}}."
> > >> > >
> > >> > > Not a rub, but I can't even get something committed using an existing
> > >> > > extensible interface. Heaven forbid a use case I have would want to
> > >> > > *change* the interface, I would probably get a -12. So I have no
> > >> > > desire to try and maintain a CDC implementation. I see myself falling
> > >> > > into the same old "why you want to do this? -1" trap.
> > >> > >
> > >> > > Coordinator Triggers:
> > >> > > To bring things back: the really old-school coordinator triggers
> > >> > > everyone always wanted. In a nutshell, I DO believe they are easier
> > >> > > to reason about than MV.
> > >> > > It is pretty basic: it happens on the coordinator, there are no
> > >> > > batchlogs or whatever, and it is best effort, possibly requiring more
> > >> > > nodes, as the keys might be on different services. Actually I tend to
> > >> > > like features like that. Once something is on the downswing of the
> > >> > > "software hype cycle" you know it is pretty stable, as everyone's all
> > >> > > excited about other things.
> > >> > >
> > >> > > As I said, I know I can use Storm for top-n, so what is this feature?
> > >> > > Well, I want to optimize my network transfer generally by building my
> > >> > > batch mutations on the server. Seems reasonable. Maybe I want to have
> > >> > > my own little "read before write" thing like CQL lists.
> > >> > >
> > >> > > The warts, having tried it: the first time I tried it, I found it did
> > >> > > not work with non-batches; patched in 3 hours. Took weeks before some
> > >> > > CQL user had the same problem and it got fixed :) There was no
> > >> > > dynamic stuff at the time so it was BYO class loader. Going against
> > >> > > the grain and saying.
> > >> > >
> > >> > > The thing you have to realize with the best effort coordinator
> > >> > > triggers is that a "transaction" could be incomplete, and well, that
> > >> > > sucks maybe for some cases. But I actually felt the 2x index
> > >> > > implementations force all problems into a type of "foreign key
> > >> > > transactional integrity" that does not make sense for Cassandra.
> > >> > >
> > >> > > Have you ever used Elasticsearch? Their version of consistency is:
> > >> > > write something, keep reading, and eventually you see it. Wildly
> > >> > > popular :) It is a crazy world.
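
For anyone following along, here is a rough sketch of DuyHai's objection to the "MV partitioner" idea in the quoted thread. It is a toy model and not Cassandra code: `NODES`, `token`, `owner`, `place_view_row` and `nodes_to_query` are hypothetical names, and the MD5-modulo placement is only an assumption standing in for the real partitioner and token ranges. It ignores replication entirely.

```python
import hashlib
import uuid

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical 4-node cluster


def token(partition_key: str) -> int:
    """Toy partitioner: hash a partition key to an integer token."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(partition_key: str) -> str:
    """Toy placement: exactly one owner per key, no replication."""
    return NODES[token(partition_key) % len(NODES)]


def place_view_row(user_id: uuid.UUID, email: str, mv_partitioner: bool) -> str:
    """Node that stores the user_by_email row for (email, user_id)."""
    if mv_partitioner:
        # Proposed idea: place the view row where the base row lives,
        # i.e. by the base table's partition key (user_id).
        return owner(str(user_id))
    # Regular partitioner: place by the view's own partition key (email).
    return owner(email)


def nodes_to_query(email: str, mv_partitioner: bool) -> list:
    """Nodes a coordinator must contact for: SELECT ... WHERE email = ?"""
    if mv_partitioner:
        # The query carries only the email. The user_id is what we are
        # trying to look up, so the owning node cannot be computed.
        return list(NODES)
    return [owner(email)]


user_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
email = "alice@example.com"

print(len(nodes_to_query(email, mv_partitioner=False)))  # 1
print(len(nodes_to_query(email, mv_partitioner=True)))   # 4
```

With the regular partitioner the read goes to the one node that holds the view row. With base-key placement the write stays local to the base row, but the read by email has to fan out to every node, which is the secondary-index behaviour MVs were meant to avoid.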