Todd Lipcon has submitted this change and it was merged. Change subject: bshuf_block: fix GetFirstKey() and GetLastKey() ......................................................................
bshuf_block: fix GetFirstKey() and GetLastKey() This fixes a bug where the bitshuffle encoding would return the wrong value for GetFirstKey() and GetLastKey(). The issue was that, in the case of UINT32s, the data_ array was being inspected after 'Finish()' had already collapsed down the elements to their minimum width. Since this collapsing is only done on UINT32 types, this would only affect these functions on dictionary-encoded binary blocks, which use bitshuffle internally for storing the UINT32 code words. However, the "cleanup" commit prior to this one also fixed some incorrect indexing that would affect BIT_SHUFFLE blocks outside the context of UINT32. These functions are only used in the case that the column in question is a non-composite primary key, in order to build the value index as well as to determine the min/max bounds of a written CFile. The result of returning the wrong value for GetLastKey() was that the resulting index blocks would be incorrect, and seeks based on key would end up at the wrong row. This could cause spurious "row not found" errors or even crashes in some cases. Apparently real users have not made use of BITSHUFFLE or DICT_ENCODING on such columns, since we haven't seen reports of this in the wild; I instead discovered this issue while working on KUDU-1751 (changing those encodings to be the default). The patch fixes the issue and also expands test coverage to cover the GetFirstKey/GetLastKey calls. Change-Id: I83dbc01ebf7a10e69b7fff6b7973967ebf6b4580 Reviewed-on: http://gerrit.cloudera.org:8080/5192 Tested-by: Kudu Jenkins Reviewed-by: Dan Burkert <[email protected]> --- M src/kudu/cfile/bshuf_block.cc M src/kudu/cfile/bshuf_block.h M src/kudu/cfile/encoding-test.cc 3 files changed, 49 insertions(+), 11 deletions(-) Approvals: Dan Burkert: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/5192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I83dbc01ebf7a10e69b7fff6b7973967ebf6b4580 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon <[email protected]>
