David Ribeiro Alves has posted comments on this change. Change subject: WIP: Expose "raw" mode in KuduScanner and allow to pass flags ......................................................................
Patch Set 5: (19 comments) http://gerrit.cloudera.org:8080/#/c/6624/5//COMMIT_MSG Commit Message: PS5, Line 10: should only > Is this a stylistic choice? Or is it by necessity? Meaning, is it impossibl I changed this according to feedback. PS5, Line 10: retreived > retrieved Done http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/client/client.h File src/kudu/client/client.h: Line 2018: enum { kDefaultRawModeFlags = -1 }; > This is confusing. If the idea is to provide a constant that logically mean k, changed this to 0, changed EnableRawModeWithFlags to SetRawModeFlags Line 2018: enum { kDefaultRawModeFlags = -1 }; > yea, I'd expect the default to be 0 Done Line 2028: PAD_UNIXTIME_MICROS_TO_16_BYTES = 1 > The idea is for these flags not to be mutually exclusive, right? If so, the yeah, it's just that at this point we only have one. I'll use that notation to make it clear. http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/client/scan_configuration.cc File src/kudu/client/scan_configuration.cc: Line 31: using strings::Substitute; > Not used? Done Line 182: raw_mode_flags_ = flags; > we should probably check that there are no unsupported flags set Done http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/client/scan_configuration.h File src/kudu/client/scan_configuration.h: Line 139: return raw_mode_flags_ != KuduScanner::kDefaultRawModeFlags; > strikes me as odd, since this is -1 (all flags set) yeah, was kind of thinking of using the first (sign) bit to distinguish between raw mode and non-raw mode. changed this to 0 anyway. http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/client/scanner-internal.cc File src/kudu/client/scanner-internal.cc: Line 245: (configuration().raw_mode_flags() & KuduScanner::PAD_UNIXTIME_MICROS_TO_16_BYTES) == > == shouldn't be necessary. Done PS5, Line 536: raw_mode_flags_ != KuduScanner::kDefaultRawModeFlags > yea, this defaultRawModeFlags thing is funny looking here - I'd expect the I've changed this to NO_FLAGS. I don't think we need to go into format versioning since it's not really required for this patch or presently. If we ever add it in the future we can add a flag for the old format (i.e. RAW_MODE_V1 as you suggest) and have NO_FLAGS mean the unaltered "client supported" format. Maybe it would make sense then to also have ROW_FORMAT_V2 so that you could padd other format modifiers with both versions, but again I don't think we need to handle that right now. and the current impl doesn't preclude adding that later. Line 550: LOG(FATAL) << "Cannot extract rows, \"raw\" mode"; > would it be possible for a user in a release build to hit this case? Or wou Users can hit this if they set a modifier flag but then use KuduScanner::NextBatch(vector<KuduRowResult>* rows) to decode the rows. http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/client/scanner-internal.h File src/kudu/client/scanner-internal.h: Line 256: Status Reset(rpc::RpcController* controller, > warning: function 'kudu::client::KuduScanBatch::Data::Reset' has a definiti Done http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/tserver/scanners.h File src/kudu/tserver/scanners.h: Line 100: int64_t raw_mode_flags() const { return raw_mode_flags_; } > yea oops. Done http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/tserver/tablet_service.cc File src/kudu/tserver/tablet_service.cc: Line 342: virtual void set_raw_mode_flags(int64_t raw_mode_flags) {} > If ScanResultCollector is to be a generic "interface" as its class comment I was torn about this. I pondered the alternative you suggest, but I don't like that I'd have to have either checksum collector either implement this as a no op or have to make this return a status. Both options seem worse than having an empty impl here, IMO Line 410: void set_raw_mode_flags(int64_t raw_mode_flags) override { > This is unintuitive; since collectors are constructed per RPC, I would have Done Line 410: void set_raw_mode_flags(int64_t raw_mode_flags) override { > would it be simpler if we pass the raw-mode flags on every ContinueScan RPC In order to do that we'd have to move the field from NewScanRequestPB to ScanRequestPB which I don't think makes much sense. Particularly because, as we add flags, we'd have to perform all the necessary validation once per continue scan request versus once per new scan. Line 411: if (raw_mode_flags == -1) return; > This special case won't be necessary if the default value is 0 rather than Done Line 413: == RawModeFlags::PAD_UNIX_TIME_MICROS_TO_16_BYTES) { > Why is the == necessary? If you mask one bit flag from raw_mode_flags, the Done http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/tserver/tserver.proto File src/kudu/tserver/tserver.proto: Line 270: // "Raw" mode flags. > +1, it shouldn't distinguish raw vs not-raw. At this level all scans are ki The distinction between raw and non-raw is gone. I feel like we went back and forth regarding this enough and progress needs to be made. When we first discussed the bitset solution it was mentioned that it could be useful if in the future we wanted to add new flags for things like row format versions without changing the protocol, which this solution supports. None of this is ideal honestly, but I'm personally not a fan of adding a TimestampPaddingMode either, it feels too "first class" for what this feature currently is. -- To view, visit http://gerrit.cloudera.org:8080/6624 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I043b6514dc5fc307fc9c94eb41f3ae79796ba273 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
