Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/10968 )
Change subject: schema: add is_deleted virtual column ...................................................................... Patch Set 9: (1 comment) This implementation is so straightfoward and the risks limited enough in scope that I'm frankly pretty happy with this approach for the time being. Here's why: > breaking change I think we can target this precisely enough (perhaps make it $$kudu.is_deleted$$ that the probability of breaking someone in real life is vanishingly small. > looking at 'is-deleted' in isolation, I think it would be better to have an > optional bitset that gets plumbed through which holds the deleted status. > [...] It's clear that in the long run incremental (CDC) scans should use a > sparse row representation I agree that a sparse row layout would be more optimal for CDC, but implementing a virtual column now doesn't preclude us from doing that in the future, and implementing an optimal solution isn't a priority for us right now. > last-updated-timestamp per-row. If I understand tablet internals correctly, > this can't ever be provided with per-row granularity, since we dispose of > per-row timestamps during compaction after the ancient history watermark. While it isn't possible to get that information today, if we wanted this feature in the future we could implement it by keeping created / modified information on a per-row basis in a separate cfile or perhaps in one of the indexes. It wouldn't necessarily be accurate for existing tables, but going forward from the time of upgrade the modified information could theoretically be maintained. http://gerrit.cloudera.org:8080/#/c/10968/7/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/10968/7/src/kudu/master/catalog_manager.cc@1302 PS7, Line 1302: if (Schema::is_column_name_reserved(col_name)) { > Yeah, Grant had the same feedback. I'm open to making that change; just wan I don't think it's unreasonable to reserve $$kudu.*$$ or $$kudu_*$$ because this is so easy to implement and meets all of the other requirements for the is_deleted feature except for namespacing. -- To view, visit http://gerrit.cloudera.org:8080/10968 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56eb1d44ba8bfbd76d8bb794b8076b695782939e Gerrit-Change-Number: 10968 Gerrit-PatchSet: 9 Gerrit-Owner: Adar Dembo <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Comment-Date: Wed, 25 Jul 2018 20:33:01 +0000 Gerrit-HasComments: Yes
