[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Hello Alexey Serbin, Mike Percy, Attila Bukor, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11263 to look at the new patch set (#10). Change subject: Blogpost describing index skip scan optimization. .. Blogpost describing index skip scan optimization. Link to the version with images: https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-09-25-index-skip-scan-optimization-in-kudu.md Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e --- A _posts/2018-09-25-index-skip-scan-optimization-in-kudu.md A img/index-skip-scan/example-table.png A img/index-skip-scan/skip-scan-example-table.png A img/index-skip-scan/skip-scan-performance-graph.png 4 files changed, 114 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/10 -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 10 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Mike Percy
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/11263 ) Change subject: Blogpost describing index skip scan optimization. .. Patch Set 9: Sorry for the confusion. I have renamed the file name in the github version to make it consistent with the last updated change. The updated file name is '2018-09-25-index-skip-scan-optimization-in-kudu.md' -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: comment Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 9 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Mike Percy Gerrit-Comment-Date: Mon, 24 Sep 2018 21:40:25 + Gerrit-HasComments: No
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/11263 ) Change subject: Blogpost describing index skip scan optimization. .. Patch Set 8: (8 comments) http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md: http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@10 PS8, Line 10: > nit: I would add something along these lines here to help transition to the Done http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@36 PS8, Line 36: contains > nit: "only contains the" Done http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@74 PS8, Line 74: http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D > that would work too, yeah Done http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@76 PS8, Line 76: decide > s/decide/have tentatively chosen/ Done http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77 PS8, Line 77: http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D > maybe, simple text representation would fit as well: Done http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@78 PS8, Line 78: take > s/take/project/ Done http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@88 PS8, Line 88: current implementation > s/current implementation/implementation in the patch/ Done http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@109 PS8, Line 109: [[2]](https://oracle-base.com/articles/9i/index-skip-scanning/): Index Skip Scanning - Oracle Database : : [[3]](https://www.sqlite.org/optoverview.html#skipscan): Skip Scan - SQLite > I see it now; feel free to ignore this. Done -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: comment Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 8 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Mike Percy Gerrit-Comment-Date: Sun, 23 Sep 2018 14:25:00 + Gerrit-HasComments: Yes
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Hello Alexey Serbin, Mike Percy, Attila Bukor, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11263 to look at the new patch set (#9). Change subject: Blogpost describing index skip scan optimization. .. Blogpost describing index skip scan optimization. Link to the version with images: https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e --- A _posts/2018-09-25-index-skip-scan-optimization-in-kudu.md A img/index-skip-scan/example-table.png A img/index-skip-scan/skip-scan-example-table.png A img/index-skip-scan/skip-scan-performance-graph.png 4 files changed, 114 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/9 -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 9 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Mike Percy
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Hello Alexey Serbin, Mike Percy, Attila Bukor, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11263 to look at the new patch set (#8). Change subject: Blogpost describing index skip scan optimization. .. Blogpost describing index skip scan optimization. Link to the version with images: https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e --- A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md A img/index-skip-scan/example-table.png A img/index-skip-scan/skip-scan-example-table.png A img/index-skip-scan/skip-scan-performance-graph.png 4 files changed, 113 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/8 -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 8 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Mike Percy
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/11263 ) Change subject: Blogpost describing index skip scan optimization. .. Patch Set 8: (7 comments) http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md: http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@7 PS7, Line 7: team > nit: lower-case "team" Done http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@27 PS7, Line 27: table > nit: lower-case "table" Done http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@45 PS7, Line 45: . We will refer to it as the : "prefix column" and its specific value as the "prefix k > nit: drop the parens and start a new sentence instead. Done http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@47 PS7, Line 47: > nit: remove comma, maybe replace with "that" Done http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@82 PS7, Line 82: : Conclusion : == > No where does this mention that, as implemented, this works for equality pr Done http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@98 PS7, Line 98: roughly enjo > nit: full-fledged Done http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@99 PS7, Line 99: right from underst > nit: "of the skip scan approach" or "of the skip scan optimization" Done -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: comment Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 8 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Mike Percy Gerrit-Comment-Date: Thu, 13 Sep 2018 22:16:44 + Gerrit-HasComments: Yes
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/11263 ) Change subject: Blogpost describing index skip scan optimization. .. Patch Set 7: (13 comments) Many thanks for the comments. Please take a look. http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md: http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@39 PS6, Line 39: table > nit: tablet, here and elsewhere Done http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@45 PS6, Line 45: (we will refer to it as : "prefix column" and its specific value as "prefix key"). > nit: since we're not using these as a variable names, but rather as definit Done http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@48 PS6, Line 48: Therefore, we can use the index to skip to the rows that have distinct prefix keys, : and also satisfy the predicate on the `tstamp` column. > nit: maybe drop the ** around "skip" here, since you do it down below anywa Done http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@58 PS6, Line 58: `host = helium` > nit: would be nice if the entire thing were in backticks, since it's a cond You are right. Made the change. http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@60 PS6, Line 60: satisfy the predicate, and we > nit: probably not needed Done http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@59 PS6, Line 59: . At that : point we would > nit: reword "until the predicate no longer matches. At that point we would Done http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@67 PS6, Line 67: tio > nit: this is a little distracting, below too. Let's just keep it singular s Done http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73 PS6, Line 73: o get worse with respect to the full tablet scan performance when the prefix column cardinality > This and below are being rendered weirdly by github. Would like to confirm Yes, it works fine with jekyll. (Link to this screen shot - https://raw.githubusercontent.com/AnupamaGupta01/kudu-1/gh-pages-staging/img/index-skip-scan/equation.png) http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@74 PS6, Line 74: C%20tablet%20%7D). : Therefore, in order to use skip scan perf > "consistent performance in cases of large prefix column cardinality" Done http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@93 PS6, Line 93: > nit: probably not needed, below too. Done http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@94 PS6, Line 94: Range pr > nit: In-list Done http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@98 PS6, Line 98: orki > nit: team Done http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@101 PS6, Line 101: : References : == : : [[1]](https://storage.googleapis.com/pub-tools-public-publication-data/pdf/42851.pdf): Gupta, Ashish, et al. "Mesa: : Geo-replicated, near real-time, scalable data warehousing." Proceedings of the VLDB Endowment 7.12 (2014): 1259-1270. : : [[2]](https://oracle-base.com/articles/9i/index-skip-scanning/): Index Skip Scanning - Oracle Database : > It's really up to you, but WDYT about just linking these in-line? This is a Thanks, I see your point. I think that the current section for references looks fine after incorporating Alexey's suggestions on the same (in Patch 4, L62). -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: comment Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 7 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Mike Percy Gerrit-Comment-Date: Thu, 13 Sep 2018 03:35:44 + Gerrit-HasComments: Yes
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Hello Alexey Serbin, Mike Percy, Attila Bukor, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11263 to look at the new patch set (#7). Change subject: Blogpost describing index skip scan optimization. .. Blogpost describing index skip scan optimization. Link to the version with images: https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e --- A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md A img/index-skip-scan/example-table.png A img/index-skip-scan/skip-scan-example-table.png A img/index-skip-scan/skip-scan-performance-graph.png 4 files changed, 112 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/7 -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 7 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Mike Percy
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/11263 ) Change subject: Blogpost describing index skip scan optimization. .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG@6 PS4, Line 6: : Blogpost describing index skip scan optimization. : > Thanks for this Andrew. I am still not sure why images are not getting rend Resolved this issue now. Added a link to the rendered version. -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: comment Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 6 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Mike Percy Gerrit-Comment-Date: Sun, 09 Sep 2018 20:13:26 + Gerrit-HasComments: Yes
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/11263 ) Change subject: Blogpost describing index skip scan optimization. .. Patch Set 6: (17 comments) http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md: http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73 PS4, Line 73: exceeds ![](http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D). : Therefore, in order to use skip scan performance benefits when possible and maintain a consistent performance with > I think it's the number of rows in the CFileSet, which I think is also the You are right ! How about rewording this to "rows in tablet" ? http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md: http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@9 PS5, Line 9: index skip scan (a.k. > It's great that you found another reference to the same idea in the google' Sounds good. Done. http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@13 PS5, Line 13: Let's b > nit: probably don't need this Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@40 PS5, Line 40: an option > nit: probably don't need this Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@38 PS5, Line 38: : Instead, a full table scan is done by default. Other databases may optimize such scans by building secondary indexes : (though it might be redundant to build one on one of the > Let's stick with a single concrete example, say `tstamp`. Then we can point Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@41 PS5, Line 41: given its lack of secondary index support. : : The question is, can Kudu do better than a full table scan here? : > nit: I think this would read better after L45. E.g. Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@47 PS5, Line 47: in the in > nit: since this is a concrete example, we know there is only one column bef Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@50 PS5, Line 50: : {% highlight SQL %} > nit: reword as "to **skip** to the rows that have distinct prefix keys, and Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61 PS5, Line 61: ce, this metho > nit: "Kudu tablet" or "tablet server" or "Kudu" Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61 PS5, Line 61: as **skip scan optimization**[2-3]. : : Performance > Maybe reverse the order of **skip** and **scan**, since the name is "skip s Done. You are correct, I have rephrased the sentence to better clarify this point. http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@70 PS5, Line 70: > nit: add "the" in front of "Lower" and "better" Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@71 PS5, Line 71: nts, on up to 10 million rows per t > I seem to recall a plot that showed the performance without the dynamic dis That's a good point. Unfortunately, I do not have the backup of that slide. http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77 PS5, Line 77: > nit: skips? for consistency with the "skip" and "scan" terminology Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77 PS5, Line 77: It will be an in > nit: I think it's clear enough that this may refer to multiple, so maybe ju Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@89 PS5, Line 89: > nit: one (`host`) Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@104 PS5, Line 104: [[1 > Do you feel good about adding one more reference? I think https://www.sqli Thanks so much for this suggestion. I have added this reference too. http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@105 PS5, Line 105: Geo- > nit: usually in the reference section they use '[x]' where it's possible to Thank you for pointing this out. Done. -- To view, visit
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Hello Alexey Serbin, Mike Percy, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11263 to look at the new patch set (#6). Change subject: Blogpost describing index skip scan optimization. .. Blogpost describing index skip scan optimization. Link to the version with images: https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e --- A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md A img/index-skip-scan/example-table.png A img/index-skip-scan/skip-scan-example-table.png A img/index-skip-scan/skip-scan-performance-graph.png 4 files changed, 111 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/6 -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 6 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Mike Percy
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/11263 ) Change subject: Blogpost describing index skip scan optimization. .. Patch Set 5: (22 comments) Please take a look. http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG@6 PS4, Line 6: : Blogpost describing index skip scan optimization. : > In reviewing blogposts, it's generally helpful to post a link to a rendered Thanks for this Andrew. I am still not sure why images are not getting rendered here -https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md. Although I do see the rendered version locally, using jekyll. Please let me know if I am missing something. http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md: http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@9 PS4, Line 9: [index skip scan][1]. > This already seems like it's going a bit too far into implementation detail Got it. Done. http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@11 PS4, Line 11: : > Probably don't need this. Done http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@14 PS4, Line 14: > What do these do? This is used to show the beginning excerpts of the post. I misunderstood earlier that it is used for newline. Moved this tag after the beginning two lines and removed this from elsewhere. http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@31 PS4, Line 31: > Maybe, add a reference (like https://en.wikipedia.org/wiki/B-tree) in-line Makes sense. Added an in-line reference. http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@31 PS4, Line 31: > nit: "a B-tree", and no need to capitalize "Tree" below Done http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@33 PS4, Line 33: `metric > nit: perhaps using ``s would be more reasonable here (ie. `host`). Then it' Done http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@32 PS4, Line 32: In this case, by default, Kudu internally builds a primary key index (implemented as a : [B-tree](https://en.wikipedia.org/wiki/B-tree)) for the table `metrics`. : As shown in the table above, the ind > IMO this doesn't convey the idea that the data is sorted by the composite o Thanks for this suggestion. I moved the example dataset from below and used it as a reference to elaborate on the points you mentioned. http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@32 PS4, Line 32: In this case, by default, Kudu internally builds a primary key index (implemented as a : [B-tree](https://en.wikipedia.org/wiki/B-tree)) for the table `metrics`. : As shown in the table above, the ind > +1 all points mentioned by Andrew here. Done http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@39 PS4, Line 39: ? > nit: here and elsewhere, no need for spaces before punctuation marks Done http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@40 PS4, Line 40: the current query execution plan does not use the index. Instead, a full tab > I'm not sure this gives a clear explanation as for the reason to perform a Rephrased this paragraph to clarify this point. http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@45 PS4, Line 45: The question is, > In general, I think the index skip scan optimization is not the only answer Done http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@46 PS4, Line 46: > The crux of this is the prefixes are also sorted, and all rows of a given p Done http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@53 PS4, Line 53: For example, consider the query: > nit: maybe, to be in sync with the CREATE TABLE statement above, write SQL Done http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@51 PS4, Line 51: and satisfying the query predicate on the `tstamp` column. : : For example, consider the query: : {% highlight SQL %} : SELECT clusterid FROM metrics WHERE tstamp = 100; : {% endhighlight %} : > Ah, so you _do_ have an example! I
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Hello Alexey Serbin, Mike Percy, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11263 to look at the new patch set (#5). Change subject: Blogpost describing index skip scan optimization. .. Blogpost describing index skip scan optimization. Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e --- A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md A img/index-skip-scan/example-table.png A img/index-skip-scan/skip-scan-example-table.png A img/index-skip-scan/skip-scan-performance-graph.png 4 files changed, 106 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/5 -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 5 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Mike Percy
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Hello Alexey Serbin, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11263 to look at the new patch set (#4). Change subject: Blogpost describing index skip scan optimization. .. Blogpost describing index skip scan optimization. Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e --- A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md A img/index-skip-scan/skip-scan-example-table.png A img/index-skip-scan/skip-scan-performance-graph.png 3 files changed, 95 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/4 -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 4 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Anupama Gupta has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/11263 ) Change subject: Blogpost describing index skip scan optimization. .. Blogpost describing index skip scan optimization. Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e --- A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md A img/index-skip-scan/skip-scan-example-table.png A img/index-skip-scan/skip-scan-performance-graph.png 3 files changed, 64 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/3 -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 3 Gerrit-Owner: Anupama Gupta
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#17). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,261 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/17 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 17 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] Efficiently support predicates on non-prefix key components
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11252 to look at the new patch set (#5). Change subject: Efficiently support predicates on non-prefix key components .. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: If9f5a4f2e9da39211d645b4d6d163ecdf919e5d2 --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,261 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/11252/5 -- To view, visit http://gerrit.cloudera.org:8080/11252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If9f5a4f2e9da39211d645b4d6d163ecdf919e5d2 Gerrit-Change-Number: 11252 Gerrit-PatchSet: 5 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10983 ) Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. Patch Set 15: (39 comments) http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/cfile/cfile_reader.cc@792 PS14, Line 792: // Currently this block holds encoded composite key. : Arena arena(16*1024); : : DCHECK(!prepared_blocks_.empty()); : > nit: As a general rule of thumb, if something is indicative of a programmer Thanks for this pointer. I have removed the DCHECK() in this case. http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/cfile/cfile_reader.cc@797 PS14, Line 797: us > nit: I know some of the older code has this reversed, but for new code, we Done http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/cfile/cfile_reader.cc@871 PS14, Line 871: last_prepare_idx_ = b->first_row_idx() + b->dblk_->GetCurrentIndex(); : last_prepare_count_ = 0; : : p > Not sure if this is necessary to avoid breaking encapsulation, as I think w ACK. http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/cfile/cfile_reader.cc@871 PS14, Line 871: last_prepare_idx_ = b->first_row_idx() + b->dblk_->GetCurrentIndex(); : last_prepare_count_ = 0; : : p > It's worth documenting in the header the side effect of `cache_seeked_value If `cache_seeked_value` is true, then a call to StoreCurrentValue() will not seek an extra row. Instead, due to the side effect of using CopyNextValues(size_t *n, ColumnDataView *dst) the first value from the last prepared PK column block will be fetched into 'dst'. I have now documented this in the header file. http://gerrit.cloudera.org:8080/#/c/10983/15/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: http://gerrit.cloudera.org:8080/#/c/10983/15/src/kudu/cfile/cfile_reader.cc@790 PS15, Line 790: Todo > nit: s/Todo/TODO/ Done http://gerrit.cloudera.org:8080/#/c/10983/15/src/kudu/cfile/cfile_reader.cc@793 PS15, Line 793: 16*1024 > Why did we go from using a constant back to using a magic number? http://wi Made this a member variable of CFileIterator as a followup of the review comment in cfile_set.cc (L553). After discussing with Alexey, we came to the conclusion that it is safe to initialize Arena with an initial buffer size of 1M. If required, the Arena size grows wrt the data being allocated to it. http://gerrit.cloudera.org:8080/#/c/10983/15/src/kudu/cfile/cfile_reader.cc@795 PS15, Line 795: DCHECK(!prepared_blocks_.empty()); : if (prepared_blocks_.empty()) { : return Status::NotFound("blocks not found"); : } > This has a code smell. https://martinfowler.com/bliki/CodeSmell.html In deb Yes it is possible. I have removed DCHECK and handled this condition by returning Status::NotFound. http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/common/encoded_key.cc File src/kudu/common/encoded_key.cc: http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/common/encoded_key.cc@92 PS14, Line 92:gscoped_ptr *key, :Arena* arena, int32_t num_columns > Per the GSG (https://google.github.io/styleguide/cppguide.html): Done http://gerrit.cloudera.org:8080/#/c/10983/15/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: http://gerrit.cloudera.org:8080/#/c/10983/15/src/kudu/tablet/cfile_set.h@175 PS15, Line 175: // Due to the caveat(see below) listed for SkipToNextScan(size_t *remaining), : // this class should not reference a separate "client" class that uses key_iter->cur_val. > I find this part of the comment confusing and I think we should just remove Done http://gerrit.cloudera.org:8080/#/c/10983/15/src/kudu/tablet/cfile_set.h@248 PS15, Line 248: Caveat: > I don't think we need to specifically call this out as a caveat Done http://gerrit.cloudera.org:8080/#/c/10983/15/src/kudu/tablet/cfile_set.h@249 PS15, Line 249: key_iter->cur_val_ > How about just key_iter_? Is that sufficient? Talking about a private membe Ok. Makes sense. Changed this to just key->iter_. http://gerrit.cloudera.org:8080/#/c/10983/15/src/kudu/tablet/cfile_set.h@362 PS15, Line 362: skip_scan_num_seeks_cutoff > missing underscore suffix on member variable Done http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: PS14: > nit: s/unique prefix/distinct prefix Done http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/cfile_set.h@174 PS14, Line 174: // : // Due to the caveat(see below) listed for SkipToNextScan(size_t
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#16). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,261 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/16 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 16 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] Efficiently support predicates on non-prefix key components
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11252 to look at the new patch set (#4). Change subject: Efficiently support predicates on non-prefix key components .. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: If9f5a4f2e9da39211d645b4d6d163ecdf919e5d2 --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,261 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/11252/4 -- To view, visit http://gerrit.cloudera.org:8080/11252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If9f5a4f2e9da39211d645b4d6d163ecdf919e5d2 Gerrit-Change-Number: 11252 Gerrit-PatchSet: 4 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] Efficiently support predicates on non-prefix key components
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11252 to look at the new patch set (#3). Change subject: Efficiently support predicates on non-prefix key components .. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: If9f5a4f2e9da39211d645b4d6d163ecdf919e5d2 --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,256 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/11252/3 -- To view, visit http://gerrit.cloudera.org:8080/11252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If9f5a4f2e9da39211d645b4d6d163ecdf919e5d2 Gerrit-Change-Number: 11252 Gerrit-PatchSet: 3 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] Efficiently support predicates on non-prefix key components
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11252 to look at the new patch set (#2). Change subject: Efficiently support predicates on non-prefix key components .. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: If9f5a4f2e9da39211d645b4d6d163ecdf919e5d2 --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,256 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/11252/2 -- To view, visit http://gerrit.cloudera.org:8080/11252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If9f5a4f2e9da39211d645b4d6d163ecdf919e5d2 Gerrit-Change-Number: 11252 Gerrit-PatchSet: 2 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] Efficiently support predicates on non-prefix key components
Anupama Gupta has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11252 Change subject: Efficiently support predicates on non-prefix key components .. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: If9f5a4f2e9da39211d645b4d6d163ecdf919e5d2 --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,248 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/11252/1 -- To view, visit http://gerrit.cloudera.org:8080/11252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If9f5a4f2e9da39211d645b4d6d163ecdf919e5d2 Gerrit-Change-Number: 11252 Gerrit-PatchSet: 1 Gerrit-Owner: Anupama Gupta
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10983 ) Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. Patch Set 15: (7 comments) I have addressed all the open comments. PTAL. http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h File src/kudu/cfile/cfile_reader.h: http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@59 PS10, Line 59: > Yep, it would be nice to have a test with super-wide primary key (multiple Added a test case where the encoded primary key size is equal to the max allowable encoded PK size (16MB), stored as max_encoded_key_size_bytes. Also, verified that any insertions that violate this size constraint results in failure. http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@444 PS12, Line 444: col_id = > If that's something we expect to happen, I would simply disable the skip sc Done http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@462 PS12, Line 462: } : : if (nonfirst_key_pred_exists) { : use_skip_scan_ = true; : : // Store the predicate column id. : skip_scan_predicate_column_id_ = min_col_id; : : > I think it's better to keep it as Andrew requested. After some considerati Sure. Done. http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/cfile_set.cc@414 PS14, Line 414: id CFileSet::Ite > nit: since this is not to change in the code below, add 'const' Done http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/cfile_set.cc@576 PS14, Line 576: > style nit: keep the brace with the name of the method Done http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/cfile_set.cc@699 PS14, Line 699: : skip_scan_num_seeks_++ > I think it makes sense to introduce a metric to capture cases when skip sca Added this info in VLOG(1). http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/index_skipscan-test.cc File src/kudu/tablet/index_skipscan-test.cc: http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/index_skipscan-test.cc@189 PS14, Line 189: break; : } : case kMaxSizedEncodedKey: { : CHECK_OK(builder.AddKeyColumn("P1", STRING)); : CHECK_OK(builder.AddKeyColumn > nit: all there might be constexpr: Done -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 15 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Wed, 15 Aug 2018 04:55:52 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#15). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,069 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/15 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 15 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Anupama Gupta has abandoned this change. ( http://gerrit.cloudera.org:8080/11067 ) Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. Abandoned This is an obsolete branch. -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 28 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10983 ) Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. Patch Set 14: (25 comments) Please review the changes. http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/cfile/cfile_reader.h File src/kudu/cfile/cfile_reader.h: http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/cfile/cfile_reader.h@345 PS12, Line 345: bool cache_seeked_value > nit: Even though this is calling StoreCurrentValue(), I think calling this Done http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/cfile/cfile_reader.h@348 PS12, Line 348: const std::string& GetCurrentValue() const; > Add 'const' modifier for this method. Done http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/cfile/cfile_reader.h@472 PS12, Line 472: // Value currently pointed to by validx_iter_. : std::string cur_val_; : > +1 Andrew, in case we have a single public function GetCurrentValue(), it would actually lead to different results because it uses CopyNextValues(size_t *n, ColumnDataView *dst) . CopyNextValues(..) fetches the next 'n' values from the block into 'dst' . Hence, the need to store this fetched value in 'cur_val_' . http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h File src/kudu/cfile/cfile_reader.h: http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@473 PS10, Line 473: std::string cur_val_; : > This breaks encapsulation. We should document something along these lines i Listed this caveat in CFileSet::Iterator . Please let me know if it fine. http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/cfile/cfile_reader.cc@792 PS12, Line 792: DCHECK(!prepared_blocks_.empty()); > What happens if this does not hold in release build? Makes sense. Added return non-OK status (Status::NotFound) in this case. http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.h@213 PS12, Line 213: // prefix key refers to the first "num_prefix_cols" columns of the current key. : // current key is > nit: Reword as "If `cache_seeked_value` is true, the predicate column itera Rephrased as: "If 'cache_seeked_value' is true, the validx_iter_ will store the value seeked to." , since the iterator is a validx_iter_ . http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.h@220 PS12, Line 220: Status SeekToRowWithCurPrefixMatchingPred(const gscoped_ptr& enc_key); : > nit: reword as "Build the key with the same prefix as `cur_enc_key`, that h Done http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@19 PS12, Line 19: #include nit: please use '#include ' instead and place that among other C++-s Done http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@76 PS12, Line 76: > Per our discussion with Mike today, I thought the idea was to have this dis You are right. I assumed this will be disabled right before merging (in the final patch). Set this to 'false' now, as now we are testing both the scenarios in index_skipscan_test.cc http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@435 PS12, Line 435: > nit: you could use 'auto' here. Removed predicates var as per comment on L436. http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@436 PS12, Line 436: spec.predi > nit: this can just be `spec.predicates()`, then there would be no need for Done http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@440 PS12, Line 440: // Get the column id from the predicate > nit: move this out of the loop, maybe up by L413 and then use it in definin Done http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@444 PS12, Line 444: col_id == > Could find_column() return -1 here? Yes, if the column is not found in the schema. Not quite sure whether to continue the loop in this case or just return with skip scan disabled. Any suggestion ? http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@462 PS12, Line 462: : // Store the predicate column id. : skip_scan_predicate_column_id_ = min_col_id; : : // Store the predicate value. : skip_scan_predicate_value_ = pred_value; : : // Store the cutoff on the number of skip scan seeks. : > It was originally written as described, but I asked her to make the change, Alexey, please let
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#14). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,023 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/14 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 14 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Anupama Gupta has removed Alexey Serbin from this change. ( http://gerrit.cloudera.org:8080/11067 ) Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. Removed reviewer Alexey Serbin. -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 28 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#28). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,023 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/28 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 28 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Anupama Gupta has removed Mike Percy from this change. ( http://gerrit.cloudera.org:8080/11067 ) Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. Removed reviewer Mike Percy. -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 27 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#27). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,023 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/27 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 27 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#13). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,018 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/13 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 13 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#26). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,018 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/26 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 26 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#25). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,016 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/25 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 25 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#12). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,017 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/12 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 12 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#24). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,017 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/24 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 24 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10983 ) Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. Patch Set 11: (2 comments) http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h File src/kudu/cfile/cfile_reader.h: http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@470 PS10, Line 470: Status StoreCurrentValue(); : > super-nit: in reading the name SetX(), based on other usages in the codebas Done http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@353 PS10, Line 353: // Whether the skip scan optimization has searched the current prefix for a predicate match : // or whether the prefix has changed since its l > How about change this to: Done -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 11 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Tue, 07 Aug 2018 23:25:16 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10983 ) Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. Patch Set 11: (41 comments) http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h File src/kudu/cfile/cfile_reader.h: http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@59 PS10, Line 59: static const int kEncodedCompositeKeyMaxSize = > Great question. This is from http://kudu.apache.org/docs/known_issues.html# Refactored this name to: kEncodedCompositeKeyMaxSize as it makes more sense wrt our skip scan use case. http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@339 PS10, Line 339: to store the value obtaine > nit: note that the "current value" refers to the value after seeking. Done http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/common/encoded_key.h File src/kudu/common/encoded_key.h: http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/common/encoded_key.h@63 PS10, Line 63: ; > nit: what does this do? Created another function IncrementEncodedKeyColumns(..) for better separation of responsibility and documented this parameter. http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/common/partial_row.h File src/kudu/common/partial_row.h: http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/common/partial_row.h@52 PS10, Line 52: mplate struct SliceTypeRowOps; // IWYU pragma: keep : template struct NumTypeRowOps; // IWYU pragma: keep > nit: not your fault, but mind removing the spaces here? Done http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@207 PS10, Line 207: gscoped_ptr > nit: if the change would be contained within cfile_set.cc, mind updating th Noted. Sticking with gscoped_ptr for now since it's used outside cfile_set.cc (in encode_key.cc). http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@209 PS10, Line 209: refix key. > nit: Since this isn't a variable name, maybe take out the underscore. Alright. Changed here and elsewhere. http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@210 PS10, Line 210: um_prefix_ > nit: 'num_prefix_cols' Done http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@210 PS10, Line 210: // prefix key refers to the first "num_prefix_cols" columns of the current key. : // current key is the key currently pointed to by validx_iter_ (prior to calling SeekToNextPrefixKey()). : // 'cache_seeked_value' controls whether we will remember the next key seeked to in : // this function. > nit: can you reword this to clarify whether "current" refers to before or a Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@76 PS9, Line 76: true > Let's default this to false for the initial merge and do some testing to en Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@80 PS9, Line 80: "Max number of skip attempts the skip scan optimization should make before " > Add: Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@405 PS9, Line 405: e > nit: extra space Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@424 PS9, Line 424: > nit: add spaces around your operators on this line, both the assignment = a Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@430 PS9, Line 430: key_cols; > nit: move this comment inside the 'if' to right below this line Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@436 PS9, Line 436: nst auto& co > nit: unnecessary parentheses Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@437 PS9, Line 437: > nit: put this comment on its own line Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@444 PS9, Line 444: p); > nit: unnecessary parentheses Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@522 PS9, Line 522: << KUD > nit: remove so much extra horizontal white space on this line Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@531 PS9, Line 531: " > s/or/which we consider to be/ Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@532 PS9, Line 532: > currently-seeked key in the primary key index For better clarity, changed this to - "cached value obtained after the previous seek in the primary key index". http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@576 PS9, Line 576: } > nit: just
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#11). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile-test.cc M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 1,016 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/11 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 11 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10983 ) Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. Patch Set 10: (8 comments) Please review the changes. http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.h@260 PS9, Line 260: key_iter_ is not NULL. > this is no longer true (checked inside the method) Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.h@264 PS9, Line 264: //higher unique prefix) to scan. : // 2. 'remaining' stores the number the entries to be scanned in the current scan range > Is this an implementation detail or part of the interface of this method? I It is an implementation detail based on the assumption that currently after initializing CFileSet::Iterator, CFileSet::Iterator::SkipToNextScan(..) has exclusive access to this variable. Have added a line about this where this variable is defined. http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.h@266 PS9, Line 266: // : // See the .cc file for details on the approach and the implementation. : void SkipToNextScan(size_t *rema > These variables are local to this method, so I don't see why a caller would Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.h@269 PS9, Line 269: : // Collect the IO statistics for each of the underlying columns. > This seems to be one of the key postconditions from my perspective, since i Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.h@271 PS9, Line 271: l void Ge > nit: 'remaining' Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@639 PS9, Line 639: // 1. Search within the ad-hoc index for the next unique prefix : //(set of keys prior to the first predicate column). : //Searching is done using validx_iter_. : // 2. Read that unique prefix from the ad-hoc index, append the lower bound : //of the first predicate column, and seek to that. : //If this matches, this is the lower bound of our desired scan. : // 3. If we found our desired lower bou > How about using something a bit higher level to describe this algorithm: Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@671 PS9, Line 671: ek to the next prefix if our p > The values need to be inverted due to the variable name change; it doesn't Done http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@820 PS9, Line 820: tion( > nit: unexpected indentation Done -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 10 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Tue, 07 Aug 2018 00:45:14 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#10). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 977 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/10 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 10 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#23). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 977 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/23 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 23 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#22). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 978 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/22 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 22 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#21). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 979 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/21 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 21 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#20). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 979 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/20 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 20 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#9). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 979 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/9 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 9 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10983 ) Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. Patch Set 7: (18 comments) Thanks for the comments. Also, added the condition for disabling skip scan on the fly. Please review the changes. http://gerrit.cloudera.org:8080/#/c/10983/7/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: http://gerrit.cloudera.org:8080/#/c/10983/7/src/kudu/tablet/cfile_set.h@44 PS7, Line 44: class EncodedKey; > move this down to line 55 Done http://gerrit.cloudera.org:8080/#/c/10983/7/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/7/src/kudu/tablet/cfile_set.cc@75 PS7, Line 75: return_from_skip_scan > instead of whether to enable this, this flag should specify how many loops Done http://gerrit.cloudera.org:8080/#/c/10983/7/src/kudu/tablet/cfile_set.cc@514 PS7, Line 514: seek_to_upper_bound_key > How about: Done http://gerrit.cloudera.org:8080/#/c/10983/7/src/kudu/tablet/cfile_set.cc@516 PS7, Line 516: 1024 > according to https://kudu.apache.org/docs/known_issues.html#_primary_keys I Added this in cfile_reader.h (this being a common file for both cfile_set.cc and cfile_reader.cc where this variable will be accessed). http://gerrit.cloudera.org:8080/#/c/10983/7/src/kudu/tablet/cfile_set.cc@518 PS7, Line 518: // Convert current key slice to encoded key. : RETURN_NOT_OK_PREPEND(EncodedKey::DecodeEncodedString( : base_data_->tablet_schema(), , : Slice(key_iter_->GetCurrentValue()), _key), "Invalid scan prefix key"); > This seems to be repeated in a couple of places. How about we factor this o Done http://gerrit.cloudera.org:8080/#/c/10983/7/src/kudu/tablet/cfile_set.cc@533 PS7, Line 533: // Set the predicate column with the predicate value. : // This is done to avoid overshooting the lower bound on the predicate value : // in the current scan range. : KuduPartialRow partial_row(&(base_data_->tablet_schema())); : gscoped_ptr key_with_pred_value = : GetKeyWithPredicateVal(_row, enc_key.Pass()); : : return key_iter_->SeekAtOrAfter(*key_with_pred_value, _match, /* set_current_value= */ true); > Like I noted above, I think this part should be factored into its own funct Done http://gerrit.cloudera.org:8080/#/c/10983/7/src/kudu/tablet/cfile_set.cc@579 PS7, Line 579: bool CFileSet::Iterator::CheckKeyMatch(const std::vector _keys, int start_col_id, int end_col_id) { : // Encode the key currently pointed to by validx_iter_. : Arena arena2(1024); : gscoped_ptr cur_enc_key; : EncodedKey::DecodeEncodedString( : base_data_->tablet_schema(), , : Slice(key_iter_->GetCurrentValue()), _enc_key); : : for (int col_id = start_col_id; col_id <= end_col_id; col_id++) { : if (base_data_->tablet_schema().column(col_id).Stringify(raw_keys[col_id]) != : base_data_->tablet_schema().column(col_id).Stringify(cur_enc_key->raw_keys()[col_id])) { : return false; : } : } : return true; : } > Instead of checking that the key matches, why not just check that the predi Done http://gerrit.cloudera.org:8080/#/c/10983/7/src/kudu/tablet/cfile_set.cc@596 PS7, Line 596: // This is a three seek approach for index skip scan implementation: : // 1. Place the validx_iter_ at the entry containing the next : //unique "prefix_key" value. : // 2. Place the iterator at or after the entry formed by the "prefix_key" : //found in 1. and the predicate value. : // 3. If the seek in 2. results in exact match with the predicate value, : //store the row id that represents the last relevant entry (upper bound) wrt the : //current "prefix_key"(found in 1.) > move this inside the function to line 605, since it describes what we are d Done http://gerrit.cloudera.org:8080/#/c/10983/7/src/kudu/tablet/cfile_set.cc@609 PS7, Line 609: bool lower_bound_key_found; : bool predicate_match = false; > can we merge these two variables into the single variable lower_bound_key_f Done http://gerrit.cloudera.org:8080/#/c/10983/7/src/kudu/tablet/cfile_set.cc@614 PS7, Line 614: // Whether to seek for the next unique prefix, this flag is false : // when the prefix key gets incremented while seeking for the lower bound : // entry in the loop. : bool continue_seeking_next_prefix = true; > I think it would be more descriptive of the
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#8). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 979 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/8 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 8 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#19). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 979 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/19 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 19 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#18). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 983 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/18 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 18 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#17). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 987 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/17 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 17 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#16). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 990 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/16 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 16 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10983 ) Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. Patch Set 7: (21 comments) Thanks for the comments. Please review the changes. http://gerrit.cloudera.org:8080/#/c/10983/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10983/5//COMMIT_MSG@7 PS5, Line 7: KUDU-1291 > nit: s/Kudu-1291/KUDU-1291/ Done http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/cfile/cfile_reader.h File src/kudu/cfile/cfile_reader.h: http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/cfile/cfile_reader.h@340 PS5, Line 340: o value index, > document this parameter in the header file comment Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.h File src/kudu/cfile/cfile_reader.h: http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.h@343 PS4, Line 343:bool *exact_match, bool set_current_value = false); > nit: add const specifier for the method. Also, consider returning const re Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.cc@788 PS4, Line 788: > what if prepared_blocks_ is empty? If that the thing-that-cannot-be, add D Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@248 PS4, Line 248: r_bound_idx wi > style nit: here and elsewhere -- we tend to stick the asterisk and ampersan Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@399 PS4, Line 399: > If 'spec' instance is not modified here, why not to pass it as a const refe Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@414 PS4, Line 414: bool nonfirst_key_pred_exists = false; > Is copying necessary here? Why not to use const reference? Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@418 PS4, Line 418: > nit: I don't think the parentheses are necessary here Done http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@413 PS5, Line 413: > nit: add punctuation to all your comments (add a period at the end of the s Done http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@511 PS5, Line 511: > this kind of api doc should go into the header file, not the implementation Done http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@527 PS5, Line 527: > since we are not using this, can we just make this /* exact_match= */ nullp We cannot make it nullptr , because exact match is being de-referenced in functions called by SeekAtOrAfter. I have added a comment that exact_match value is not used after the current call. http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@565 PS5, Line 565: tig > s/Get/Return/ Done http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@568 PS5, Line 568: const ColumnSchema = cont_row.schema()->column(i); > remove or VLOG(2) Done http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@581 PS5, Line 581: rena2(1024); > nit: use more descriptive variable names here Done http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@588 PS5, Line 588: let_schema( > how about rename this to: SkipToNextScan() Done http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@602 PS5, Line 602: //store the row id that represents the last relevant entry (upper bound) wrt the > I'm not comfortable merging this patch without this feature Done http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@662 PS5, Line 662: > exclusive upper bound Done http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@668 PS5, Line 668: : bool exact > what guarantees the skip_scan_upper_bound_idx_ has been set to upper_bound_ This is a valid point. I have updated the skip_scan_upper_bound_idx_ value here now. http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@675 PS5, Line 675: ow > wording nit: Check to see whether we have seeked backwards. If so, we need Done http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@684 PS5, Line 684: // We didn't find an exact match for our lower bound, so re-seek. > can we convert this to a while-loop instead of a do-while-loop, now that th Done
[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#7). Change subject: KUDU-1291. Efficiently support predicates on non-prefix key components .. KUDU-1291. Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 920 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/7 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 7 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#6). Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components .. Kudu-1291 Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 920 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/6 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 6 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#15). Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components .. Kudu-1291 Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 920 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/15 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 15 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#14). Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components .. Kudu-1291 Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 925 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/14 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 14 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10983 ) Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components .. Patch Set 4: (22 comments) Please review the changes. http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.cc@785 PS4, Line 785: Slice ret; > add: DCHECK_EQ(nullptr, validx_iter_); Added a check for this. Actually no, I think there is no guarantee that value index will always be string. I have added a flag to invoke this function (set_current_value) now . I wonder if there is an alternative to deal with this. http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@204 PS4, Line 204: // This function is used to place the validx_iter_ at the next greater "prefix_key". > nit: you should try to maintain the same order in the header file as the .c Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@283 PS4, Line 283: skip_scan_enabled_ > rename to use_skip_scan_ because this is confusingly similar to the gflag n Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@286 PS4, Line 286: suffix_key_pred_value_ > rename to skip_scan_predicate_value_ Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@289 PS4, Line 289: suffix_key_column_id_ > rename to skip_scan_predicate_column_id_ Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@291 PS4, Line 291: // Row id of the last entry of "suffix_key" predicate value wrt to the : // "prefix_key" currently pointed to by validx_iter_ > use something like this as the description: Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@293 PS4, Line 293: int > make this an int64_t Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@538 PS4, Line 538: if (skip_scan_enabled_ && (static_cast(cur_idx_) > suffix_key_upper_bound_idx_)) { > Can you break this out into its own function? Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@540 PS4, Line 540: bool suffix_key_match, continue_seeking_next_prefix = false; > nit: declare these variables separately Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@542 PS4, Line 542: Status next_entry_match, suffix_key_upper_bound_match; > Can we avoid maintaining so much state outside the loop? Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@602 PS4, Line 602: continue_seeking_next_prefix = true; > not indented Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/index_skipscan-test.cc File src/kudu/tablet/index_skipscan-test.cc: http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/index_skipscan-test.cc@56 PS4, Line 56: kRandom > the random tests here are not very random. Can you add an actually-random t Thanks for this suggestion. I have added the random tests now. http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/index_skipscan-test.cc@145 PS4, Line 145: int32_t kNumDistinctP1 = 2; : int16_t kNumDistinctP2 = 10; : int kNumDistinctP3 = 5; : int8_t kNumDistinctP4 = 1; : int8_t kNumDistinctP5 = 20; > declare all of these variables const Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/index_skipscan-test.cc@163 PS4, Line 163: int32_t > int16_t for p2 here and in the below loops Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/index_skipscan-test.cc@226 PS4, Line 226: //Only 1 row inserted > nit: formatting, should be: Done. http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/index_skipscan-test.cc@253 PS4, Line 253: for (int i = 1; i <= 1; i++) { > remove this Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/index_skipscan-test.cc@255 PS4, Line 255: i+1 > style nit: please put spaces around infix operators Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/index_skipscan-test.cc@275 PS4, Line 275: p1 ++ > style nit: please no spaces before postfix operators Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/index_skipscan-test.cc@297 PS4, Line 297: > style nit: collapse >1 consecutive empty lines into a single empty line so Done http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/index_skipscan-test.cc@306 PS4, Line 306: void ScanTablet(ScanSpec *spec, vector *results, const char *descr) { > warning: parameter 'descr' is unused
[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#5). Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components .. Kudu-1291 Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 788 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/5 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 5 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#13). Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components .. Kudu-1291 Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 788 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/13 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 13 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#12). Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components .. Kudu-1291 Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 784 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/12 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 12 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#11). Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components .. Kudu-1291 Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 798 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/11 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 11 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10983 ) Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components .. Patch Set 4: (41 comments) Thanks for the helpful comments Mike. Please review the changes. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/cfile/cfile_reader.cc@793 PS3, Line 793: > This only works if the cell is a pointer, like in the case of a string valu Yes, this works for string and binary values stored as uint8_t byte array. The ad-hoc index stores the keys that are either encoded as binary prefix blocks or binary dict blocks. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/encoded_key.h File src/kudu/common/encoded_key.h: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/encoded_key.h@63 PS3, Line 63: um_columns = 0) > document this parameter in the comment Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/encoded_key.cc File src/kudu/common/encoded_key.cc: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/encoded_key.cc@111 PS3, Line 111: num_columns, arena)) { > I thought we were going to generalize this to any column in the compound PK Right, I have updated this now. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/partial_row.h File src/kudu/common/partial_row.h: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/partial_row.h@494 PS3, Line 494: rivate: > This class is a part of the public C++ client API (see KUDU_EXPORT) so we w Thanks, it makes sense. I have now added the calling class (CfileSet) as a friend class of KuduPartialRow. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/partial_row.h@498 PS3, Line 498: friend class RowOpe > Why is this private field now public? Should not be necessary. This field is used in the function - CFileSet::Iterator::PrepareBatch , to set the minimum values for some columns. I am not quite sure if there is a better alternative to this. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/wire_protocol-test-util.h File src/kudu/common/wire_protocol-test-util.h: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/wire_protocol-test-util.h@38 PS3, Line 38: inline void AddTestRowWithNullableStringToPB(RowOperationsPB::Type op_type, > move this to the test where it's being used I have removed this now. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.h@203 PS3, Line 203: : // This function is used to place the va > These methods need doc comments Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.h@230 PS3, Line 230: initted_(false), > add documentation for this method Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.h@264 PS3, Line 264: size_t cur_idx_; : size_t prepared_count_; > these variables should be documented with a comment Done http://gerrit.cloudera.org:8080/#/c/10983/1/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/1/src/kudu/tablet/cfile_set.cc@669 PS1, Line 669: > instead of this loop, I wonder if we can instead "peek" at the last value s Makes sense. But I am not quite sure if we can "peek" at the last seeked ordinal in the ad-hoc index column here. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@395 PS3, Line 395: tate. > let's rename this CanUseSkipScan(), have it return a boolean, and set Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@397 PS3, Line 397: } > Add a flag DEFINE_bool(enable_skip_scan, true, "...") that allows us to tur Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@400 PS3, Line 400: int num_key_cols = base_data_->tablet_schema().num_key_columns(); > looks like a compile error? probably some unstaged changes Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@415 PS3, Line 415: for (auto it=predicates.begin(); it!=predicates.end(); ++it) { > error: use of undeclared identifier 'suffix_key_id_' [clang-diagnostic-erro Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@418 PS3, Line 418: string col_name = (it->second).column().name(); > error: use of undeclared identifier 'suffix_key_id_' [clang-diagnostic-erro Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@429 PS3, Line 429: nonfirst_key_pred_exists = true; > error: use of undeclared identifier 'suffix_key_id_'; did
[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#4). Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components .. Kudu-1291 Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 889 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/4 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 4 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#10). Change subject: WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column .. WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 889 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/10 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 10 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] Kudu-1291 Efficiently support predicates on non-prefix key components
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#9). Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components .. Kudu-1291 Efficiently support predicates on non-prefix key components This patch implements index skip scan approach in the case when there exists an equality predicate on any of the non-first primary key columns. This feature uses the following approach: 1. Seek at the first unique prefix key in the ad-hoc index tree. 2. Seek to the relevant entry corresponding to the unique prefix key found in 1. and the predicate value on the suffix key column. 3. To handle the case where multiple entries exist with respect to a unique prefix key, store the upper bound index of these entries. 4. Repeat steps 1-3, till all the unique prefix keys are scanned. Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 890 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/9 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 9 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#8). Change subject: WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column .. WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 890 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/8 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 8 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#7). Change subject: WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column .. WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 891 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/7 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 7 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#6). Change subject: WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column .. WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 890 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/6 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 6 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#5). Change subject: WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column .. WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 890 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/5 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 5 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#4). Change subject: WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column .. WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 9 files changed, 893 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/4 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 4 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#3). Change subject: WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column .. WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/common/wire_protocol-test-util.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc 10 files changed, 907 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/3 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 3 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11067 to look at the new patch set (#2). Change subject: WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column .. WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/common/wire_protocol-test-util.h M src/kudu/tablet/CMakeLists.txt M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test.cc M src/kudu/tserver/tablet_server-test-base.cc 11 files changed, 898 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/2 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 2 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column
Hello Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#3). Change subject: WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column .. WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/common/wire_protocol-test-util.h M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test-2.cc M src/kudu/tserver/CMakeLists.txt A src/kudu/tserver/tablet_server-test-2.cc M src/kudu/tserver/tablet_server-test-base.cc M src/kudu/tserver/tablet_server-test-base.h 13 files changed, 910 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/3 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 3 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column
Anupama Gupta has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11067 Change subject: WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column .. WIP 1. Added correctness test for skip scan 2. Extended the approach for equality query on any PK column Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/common/wire_protocol-test-util.h M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h A src/kudu/tablet/index_skipscan-test-2.cc M src/kudu/tserver/CMakeLists.txt A src/kudu/tserver/tablet_server-test-2.cc M src/kudu/tserver/tablet_server-test-base.cc M src/kudu/tserver/tablet_server-test-base.h 13 files changed, 910 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/11067/1 -- To view, visit http://gerrit.cloudera.org:8080/11067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iffbc240233e5ad52f6a1c7912d4e44baf32a5f2d Gerrit-Change-Number: 11067 Gerrit-PatchSet: 1 Gerrit-Owner: Anupama Gupta
[kudu-CR] WIP: KUDU-1291 Efficiently support predicates on non-prefix key components
Hello Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10983 to look at the new patch set (#2). Change subject: WIP: KUDU-1291 Efficiently support predicates on non-prefix key components .. WIP: KUDU-1291 Efficiently support predicates on non-prefix key components This is a POC for index skip scan: 1. Currently, the implementation is targeted for predicates that consist of an equality on only the last column of the compound PK. 2. In the comments, the term "prefix key" is used for the first n-1 keys of the compound PK and "suffix key" is used for the last/nth key. Change-Id: Ic7ee6433c25020cb9078faf18b35aa6a91bfc42c Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/common/wire_protocol-test-util.h M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h M src/kudu/tserver/CMakeLists.txt A src/kudu/tserver/tablet_server-test-2.cc M src/kudu/tserver/tablet_server-test-base.cc M src/kudu/tserver/tablet_server-test-base.h 12 files changed, 452 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/2 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 2 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] WIP: KUDU-1291 Efficiently support predicates on non-prefix key components
Anupama Gupta has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10983 Change subject: WIP: KUDU-1291 Efficiently support predicates on non-prefix key components .. WIP: KUDU-1291 Efficiently support predicates on non-prefix key components This is a POC for index skip scan: 1. Currently, the implementation is targeted for predicates that consist of an equality on only the last column of the compound PK. 2. In the comments, the term "prefix key" is used for the first n-1 keys of the compound PK and "suffix key" is used for the last/nth key. Change-Id: Ic7ee6433c25020cb9078faf18b35aa6a91bfc42c Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f --- M src/kudu/cfile/cfile_reader.cc M src/kudu/cfile/cfile_reader.h M src/kudu/common/encoded_key.cc M src/kudu/common/encoded_key.h M src/kudu/common/partial_row.h M src/kudu/common/wire_protocol-test-util.h M src/kudu/tablet/cfile_set.cc M src/kudu/tablet/cfile_set.h M src/kudu/tserver/CMakeLists.txt A src/kudu/tserver/tablet_server-test-2.cc M src/kudu/tserver/tablet_server-test-base.cc M src/kudu/tserver/tablet_server-test-base.h 12 files changed, 455 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/10983/1 -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 1 Gerrit-Owner: Anupama Gupta
[kudu-CR] KUDU-1291 WIP
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10849 to look at the new patch set (#3). Change subject: KUDU-1291 WIP .. KUDU-1291 WIP This patch implements a simple iteration through B-Tree on a new test schema. Change-Id: Ie97986b4196a8a7c8784d6449eb3592c30fb421f --- M src/kudu/common/wire_protocol-test-util.h M src/kudu/tserver/CMakeLists.txt A src/kudu/tserver/tablet_server-test-3.cc M src/kudu/tserver/tablet_server-test-base.cc M src/kudu/tserver/tablet_server-test-base.h 5 files changed, 283 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/10849/3 -- To view, visit http://gerrit.cloudera.org:8080/10849 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie97986b4196a8a7c8784d6449eb3592c30fb421f Gerrit-Change-Number: 10849 Gerrit-PatchSet: 3 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-1291 WIP
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10849 to look at the new patch set (#2). Change subject: KUDU-1291 WIP .. KUDU-1291 WIP This patch implements a simple iteration through B-Tree on a new test schema. Change-Id: Ie97986b4196a8a7c8784d6449eb3592c30fb421f --- M src/kudu/tserver/CMakeLists.txt A src/kudu/tserver/tablet_server-test-3.cc M src/kudu/tserver/tablet_server-test-base.cc M src/kudu/tserver/tablet_server-test-base.h 4 files changed, 276 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/10849/2 -- To view, visit http://gerrit.cloudera.org:8080/10849 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie97986b4196a8a7c8784d6449eb3592c30fb421f Gerrit-Change-Number: 10849 Gerrit-PatchSet: 2 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-1291 WIP
Anupama Gupta has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10849 Change subject: KUDU-1291 WIP .. KUDU-1291 WIP This patch implements a simple iteration through B-Tree on a new test schema. Change-Id: Ie97986b4196a8a7c8784d6449eb3592c30fb421f --- M src/kudu/tserver/CMakeLists.txt A src/kudu/tserver/tablet_server-test-3.cc M src/kudu/tserver/tablet_server-test-base.cc M src/kudu/tserver/tablet_server-test-base.h 4 files changed, 277 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/10849/1 -- To view, visit http://gerrit.cloudera.org:8080/10849 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie97986b4196a8a7c8784d6449eb3592c30fb421f Gerrit-Change-Number: 10849 Gerrit-PatchSet: 1 Gerrit-Owner: Anupama Gupta
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10591 ) Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. Patch Set 14: (10 comments) Please review the changes. http://gerrit.cloudera.org:8080/#/c/10591/12/src/kudu/consensus/log.cc File src/kudu/consensus/log.cc: http://gerrit.cloudera.org:8080/#/c/10591/12/src/kudu/consensus/log.cc@997 PS12, Line 997: < k > add kLogPrefix here too Done http://gerrit.cloudera.org:8080/#/c/10591/12/src/kudu/integration-tests/ts_recovery-itest.cc File src/kudu/integration-tests/ts_recovery-itest.cc: http://gerrit.cloudera.org:8080/#/c/10591/12/src/kudu/integration-tests/ts_recovery-itest.cc@88 PS12, Line 88: > There's no need to change the namespace of this test to access stuff in the Done http://gerrit.cloudera.org:8080/#/c/10591/12/src/kudu/integration-tests/ts_recovery-itest.cc@149 PS12, Line 149: A > s/4/3/ Done http://gerrit.cloudera.org:8080/#/c/10591/12/src/kudu/integration-tests/ts_recovery-itest.cc@152 PS12, Line 152: vector flags; > Please put a period (or appropriate punctuation) at the end of this sentenc Done http://gerrit.cloudera.org:8080/#/c/10591/12/src/kudu/integration-tests/ts_recovery-itest.cc@154 PS12, Line 154: // that is done to be able to quickly fill up the log segments in order to corrupt them > This explanation states the obvious for anyone familiar with Kudu looking a Done http://gerrit.cloudera.org:8080/#/c/10591/12/src/kudu/integration-tests/ts_recovery-itest.cc@166 PS12, Line 166: TestWorkload write_workload(cluster_.get()); > 5 MB seems excessive when the log segments are only 1MB. However in reality Sorry for the typo in the comment. I changed this to 32 KB to speed up the test as the maximum allowable value in this case is around 65Kb. http://gerrit.cloudera.org:8080/#/c/10591/12/src/kudu/integration-tests/ts_recovery-itest.cc@171 PS12, Line 171: > nit: leave only one blank line, not two Done http://gerrit.cloudera.org:8080/#/c/10591/12/src/kudu/integration-tests/ts_recovery-itest.cc@227 PS12, Line 227: } > Add a comment that this needs to be at least 60 seconds to not be flaky whe Done http://gerrit.cloudera.org:8080/#/c/10591/12/src/kudu/integration-tests/ts_recovery-itest.cc@230 PS12, Line 230: sensusS > nit: s/Ensures/Ensure/ Done http://gerrit.cloudera.org:8080/#/c/10591/12/src/kudu/integration-tests/ts_recovery-itest.cc@238 PS12, Line 238: // that are referenced by a tablet will not be reused. > nit: remove extra blank line Done -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 14 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Sat, 09 Jun 2018 04:35:58 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Hello Tidy Bot, Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#14). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/ts_recovery-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 5 files changed, 139 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/14 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 14 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Hello Tidy Bot, Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#13). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/ts_recovery-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 5 files changed, 140 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/13 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 13 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10591 ) Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. Patch Set 12: (23 comments) Thanks for the useful comments Mike. I have addressed all of them. http://gerrit.cloudera.org:8080/#/c/10591/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10591/7//COMMIT_MSG@18 PS7, Line 18: > after the next tablet copy. Done http://gerrit.cloudera.org:8080/#/c/10591/10/src/kudu/consensus/log.h File src/kudu/consensus/log.h: http://gerrit.cloudera.org:8080/#/c/10591/10/src/kudu/consensus/log.h@145 PS10, Line 145: * > nit: don't make these changes, keep the modifiers next to the type to be co Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.h File src/kudu/consensus/log.h: http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.h@143 PS7, Line 143: , > add: " if it exists" Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.h@145 PS7, Line 145: RemoveRecoveryDir > How about renaming this to RemoveRecoveryDirIfExists() ? Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.cc File src/kudu/consensus/log.cc: http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.cc@996 PS7, Line 996: !fs_ > nit: this syntax is neat, but for consistency with the rest of the Kudu C++ Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.cc@997 PS7, Line 997: b > nit: extra space, also make this a VLOG(1) to avoid spamming the server log Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.cc@1002 PS7, Line 1002: con > All of these logs should also log the prefix "T P : Makes sense. Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.cc@1005 PS7, Line 1005: str > All of these LOG messages used to be VLOG_WITH_PREFIX(1) and I think we sho Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.cc@1006 PS7, Line 1006: VLOG(1) << kLogPrefi > nit: line up your indentation with the previous << here and elsewhere Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc File src/kudu/integration-tests/recover_tablet-itest.cc: http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@59 PS7, Line 59: > nit: Regression test for KUDU-1038. Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@60 PS7, Line 60: > or is not fsynced before a crash Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@61 PS7, Line 61: : : > slight rewrite: Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@64 PS7, Line 64: : > suggestion: Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@66 PS7, Line 66: > Can you move this test to ts_recovery-itest and make this part of TsRecover Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@79 PS7, Line 79: > This actually just creates a new tablet. Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@87 PS7, Line 87: > nit: missing trailing period per style guide per https://google.github.io/s Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@96 PS7, Line 96: : : > you can get rid of this Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@107 PS7, Line 107: > remove Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@111 PS7, Line 111: > remove this debugging log message Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@117 PS7, Line 117: > add a period at the end of your comment sentences, here and elsewhere in th Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@133 PS7, Line 133: > This comment describes the next line of code and is basically superfluous, Done. Although the test works even for cases when the deleted log segment is at index 1. http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@141 PS7, Line 141: > remove this comment Done http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@144 PS7, Line 144: > remove this sleep Done -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id:
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Hello Tidy Bot, Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#12). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps after the next tablet copy. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/ts_recovery-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 5 files changed, 144 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/12 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 12 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Hello Tidy Bot, Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#11). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/ts_recovery-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 5 files changed, 144 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/11 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 11 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Hello Tidy Bot, Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#10). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/ts_recovery-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 5 files changed, 144 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/10 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 10 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Hello Tidy Bot, Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#9). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/ts_recovery-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 5 files changed, 143 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/9 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 9 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Hello Mike Percy, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#8). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/ts_recovery-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 5 files changed, 143 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/8 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 8 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#7). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/recover_tablet-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 6 files changed, 206 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/7 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 7 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#6). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/recover_tablet-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 6 files changed, 206 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/6 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 6 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#5). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/recover_tablet-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 6 files changed, 206 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/5 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 5 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#4). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/recover_tablet-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 6 files changed, 198 insertions(+), 36 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/4 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 4 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#3). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/recover_tablet-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 6 files changed, 203 insertions(+), 36 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/3 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 3 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory,if any
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10591 to look at the new patch set (#2). Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory,if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory,if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/recover_tablet-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 6 files changed, 201 insertions(+), 36 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/2 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 2 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any
Anupama Gupta has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10591 Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any .. KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any This fix ensures successful bootstrap of a failed tablet replica, in the case when its WAL segments are missing/deleted. The following scenario is tested: 1. A log segment is deleted from one of the Tablet Replicas. 2. On server restart, the replica in step 1. enters a failed state. ( Without this fix, due to the presence of log recovery directory (with deleted log segments), the replica remains in a FAILED state and fails to bootstrap ) 3. This fix deletes the log recovery directory when the tablet data is deleted, as a result of which the failed replica successfully bootstraps. Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/recover_tablet-itest.cc M src/kudu/tablet/tablet_bootstrap.cc M src/kudu/tserver/ts_tablet_manager.cc 6 files changed, 201 insertions(+), 36 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/10591/1 -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 1 Gerrit-Owner: Anupama Gupta
[kudu-CR] Add test
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10590 to look at the new patch set (#4). Change subject: Add test .. Add test Change-Id: I89ebc727d677a1aab39f3898d61ea988da8a0244 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h A src/kudu/integration-tests/delete_tablet-itest2.cc M src/kudu/tserver/ts_tablet_manager.cc 4 files changed, 196 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/10590/4 -- To view, visit http://gerrit.cloudera.org:8080/10590 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I89ebc727d677a1aab39f3898d61ea988da8a0244 Gerrit-Change-Number: 10590 Gerrit-PatchSet: 4 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] Add test
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10590 to look at the new patch set (#3). Change subject: Add test .. Add test Change-Id: I89ebc727d677a1aab39f3898d61ea988da8a0244 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h A src/kudu/integration-tests/delete_tablet-itest2.cc M src/kudu/tserver/ts_tablet_manager.cc 4 files changed, 196 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/10590/3 -- To view, visit http://gerrit.cloudera.org:8080/10590 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I89ebc727d677a1aab39f3898d61ea988da8a0244 Gerrit-Change-Number: 10590 Gerrit-PatchSet: 3 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] Add test
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10590 to look at the new patch set (#2). Change subject: Add test .. Add test Change-Id: I89ebc727d677a1aab39f3898d61ea988da8a0244 --- M src/kudu/consensus/log.cc M src/kudu/consensus/log.h A src/kudu/integration-tests/delete_tablet-itest2.cc M src/kudu/tserver/ts_tablet_manager.cc 4 files changed, 196 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/10590/2 -- To view, visit http://gerrit.cloudera.org:8080/10590 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I89ebc727d677a1aab39f3898d61ea988da8a0244 Gerrit-Change-Number: 10590 Gerrit-PatchSet: 2 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot