Adar Dembo has posted comments on this change.

Change subject: WIP: Expose "raw" mode in KuduScanner and allow to pass flags
......................................................................


Patch Set 5:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/6624/5//COMMIT_MSG
Commit Message:

PS5, Line 10: retreived
retrieved


PS5, Line 10: should only
Is this a stylistic choice? Or is it by necessity? Meaning, is it impossible to 
safely retrieve data from a "normal" scanner via direct_data()/indirect_data()?


http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/client/client.h
File src/kudu/client/client.h:

Line 2018:   enum { kDefaultRawModeFlags = -1 };
This is confusing. If the idea is to provide a constant that logically means 
"raw mode on but no flags", it should have a value 0 and be part of the 
RawModeFlags enum. But if it means "no raw mode at all"...isn't that the state 
of the world if you don't call EnableRawModeWithFlags()?

Or is this some internal thing? In which case, why is it here at all?


Line 2028:     PAD_UNIXTIME_MICROS_TO_16_BYTES = 1
The idea is for these flags not to be mutually exclusive, right? If so, the 
value of the first one should be "1 << 0" so it's clear that the next ones 
should be "1 << 1", "1 << 2", etc. That's how they can be bitflags and OR'ed 
together.


http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/client/scan_configuration.cc
File src/kudu/client/scan_configuration.cc:

Line 31: using strings::Substitute;
Not used?


http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/client/scanner-internal.cc
File src/kudu/client/scanner-internal.cc:

Line 245:       (configuration().raw_mode_flags() & 
KuduScanner::PAD_UNIXTIME_MICROS_TO_16_BYTES) ==
== shouldn't be necessary.


http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/tserver/scanners.h
File src/kudu/tserver/scanners.h:

Line 100:   int64_t raw_mode_flags() const { return raw_mode_flags_; }
Why is this a ScannerManager thing? I thought it'd only be per scanner?


http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

Line 342:   virtual void set_raw_mode_flags(int64_t raw_mode_flags) {}
> warning: parameter 'raw_mode_flags' is unused [misc-unused-parameters]
If ScanResultCollector is to be a generic "interface" as its class comment 
suggests, it shouldn't provide this default no-op implementation.


Line 410:   void set_raw_mode_flags(int64_t raw_mode_flags) override {
This is unintuitive; since collectors are constructed per RPC, I would have 
expected the right value for pad_unixtime_micros_to_16_bytes_ to be provided 
directly to the constructor. But I think I understand why: the collector is 
stack-allocated before fetching the Scanner, and reversing the order would 
require heap-allocating the collector.

Maybe you can try to explain this chicken-and-egg problem somewhere in the 
collector comments so it's clear for others?


Line 411:     if (raw_mode_flags == -1) return;
This special case won't be necessary if the default value is 0 rather than -1.


Line 413:         == RawModeFlags::PAD_UNIX_TIME_MICROS_TO_16_BYTES) {
Why is the == necessary? If you mask one bit flag from raw_mode_flags, the 
result will be either non-zero (requested) or zero (not requested).


http://gerrit.cloudera.org:8080/#/c/6624/5/src/kudu/tserver/tserver.proto
File src/kudu/tserver/tserver.proto:

Line 270:   // "Raw" mode flags.
I understand the desire for a "raw" mode client-side, so that users who sign up 
for timestamp padding are forced to also call  direct_data()/indirect_data() 
instead of row-by-row access.

But why is it exposed to the server? AFAICT, there's no server-side effect. The 
only thing that matters is the timestamp padding, and that can be exposed via 
something like TimestampPaddingMode (similar to ReadMode or OrderMode).


-- 
To view, visit http://gerrit.cloudera.org:8080/6624
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I043b6514dc5fc307fc9c94eb41f3ae79796ba273
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes

Reply via email to