Adar Dembo has posted comments on this change. Change subject: KUDU-1812. Redact pretty-printed sensitive user data ......................................................................
Patch Set 4: (16 comments) http://gerrit.cloudera.org:8080/#/c/5555/1//COMMIT_MSG Commit Message: Line 7: KUDU-1812. Redact pretty-printed sensitive user data A lot of decisions went into this patch, and I think they should be documented here. For example, redaction in ToString() methods vs. redaction via KUDU_REDACT(). http://gerrit.cloudera.org:8080/#/c/5555/4//COMMIT_MSG Commit Message: Line 53: Should also add a note about what it means to use KUDU_REDACT in the C++ client (namely, that it checks the gflag, but there's not yet a way for third party applications to change the value of the gflag). http://gerrit.cloudera.org:8080/#/c/5555/1/src/kudu/cfile/cfile_util.h File src/kudu/cfile/cfile_util.h: PS1, Line 84: CFileIterator* it, : std::ostream* out, I've spent some time thinking about this warning, and there's something about it that feels a little odd. Isn't the warning self-evident given the nature of the method? If the goal is to leave a breadcrumb to "tag" methods that return strings containing row data, perhaps a more concise way to do that would be preferable here, where the warning itself is unnecessary? http://gerrit.cloudera.org:8080/#/c/5555/1/src/kudu/client/batcher.cc File src/kudu/client/batcher.cc: Line 520: // so that when the user calls Flush, we are ready to go. Arguably this is just spam without the partition key. Line 569: } Likewise, maybe just remove the string, it's not really adding any value anymore. http://gerrit.cloudera.org:8080/#/c/5555/1/src/kudu/client/scanner-internal.h File src/kudu/client/scanner-internal.h: Line 223: return strings::Substitute("Scanner { table: $0, projection: $1, scan_spec: $2 }", Nit: since there are a lot of callers, perhaps move to scanner-internal.cc? http://gerrit.cloudera.org:8080/#/c/5555/4/src/kudu/common/partition-test.cc File src/kudu/common/partition-test.cc: Line 519: // Ensure that redaction does not apply to partitions, since they are metadata. Shouldn't this go above the CreatePartitions() call, for completeness? Also, I think this could be clarified a bit: "Explicitly enable redaction. It should have no effect on the subsequent partition pretty printing tests, as partitions are metadata and thus not redacted". http://gerrit.cloudera.org:8080/#/c/5555/4/src/kudu/common/partition.h File src/kudu/common/partition.h: PS4, Line 204: // Returns a text description of this partition schema suitable for display in the web UI. : // The format of this string is not guaranteed to be identical cross-version. : // : // 'range_partitions' should include the set of range partitions in the table, : // as formatted by 'RangePartitionDebugString'. : std::string DisplayString(const Schema& schema, : const std::vector<std::string>& range_partitions) const; : : // Returns header and entry HTML cells for the partition schema for the master : // table web UI. This is an abstraction leak, but it's better than leaking the : // internals of partitions to the master path handlers. : std::string PartitionTableHeader(const Schema& schema) const; : std::string PartitionTableEntry(const Schema& schema, const Partition& partition) const; Should probably be annotated as being metadata and thus not redacting. Line 250: // Returns a text description of the encoded range key suitable for debug printing. Likewise, for consistency with DebugStringComponents (also private). Line 301: void AppendRangeDebugStringComponentsOrMin(const KuduPartialRow& row, Likewise. http://gerrit.cloudera.org:8080/#/c/5555/4/src/kudu/common/row_operations.cc File src/kudu/common/row_operations.cc: Line 52: return Substitute("SPLIT_ROW $0", split_row->ToString()); Isn't a split point is morally equivalent to a partition boundary? http://gerrit.cloudera.org:8080/#/c/5555/4/src/kudu/tools/tool_main.cc File src/kudu/tools/tool_main.cc: Line 227: FLAGS_log_redact_user_data = false; Let's add a comment explaining this. http://gerrit.cloudera.org:8080/#/c/5555/4/src/kudu/util/flag_tags-test.cc File src/kudu/util/flag_tags-test.cc: PS4, Line 54: // Set to true via KuduTest, but explicitly unset here as this test deals : // with unsafe and experimental flags. Should update this comment to make it more clear that KuduTest overrides the default values of these flags, so we need to restore the default values here. http://gerrit.cloudera.org:8080/#/c/5555/4/src/kudu/util/flags-test.cc File src/kudu/util/flags-test.cc: PS4, Line 44: // Set to true via KuduTest, but explicitly unset here as this test parses : // command line flags and fails if an unsafe or experimental flag is set. Likewise. http://gerrit.cloudera.org:8080/#/c/5555/4/src/kudu/util/logging-test.cc File src/kudu/util/logging-test.cc: Line 195: }); Nit: indentation? http://gerrit.cloudera.org:8080/#/c/5555/4/src/kudu/util/logging.h File src/kudu/util/logging.h: Line 154: extern const char* const kRedactionMessage; This is duplicated from just before ScopedDisableRedaction. -- To view, visit http://gerrit.cloudera.org:8080/5555 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2b87a3065280116bb8af6f26f072dafdfd1ee077 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
