Todd Lipcon has posted comments on this change.

Change subject: Non-covering Range Partitions design doc
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/2772/1/docs/design-docs/non-covering-range-partitions.md
File docs/design-docs/non-covering-range-partitions.md:

Line 16: to all possible
       : range values
I think it would be clearer to say "tablets which cover the entire range 
of possible keys" or something. "All possible range values" doesn't quite make 
sense to me.


Line 61: exclusive upper bound
I agree this makes life easier for us, but doesn't this make life difficult for 
the LIST-partitioning use case? Or even date-based partitioning, etc.

Given all the machinery we already have for converting between exclusive and 
non-exclusive predicates, could we reuse a lot of that code to offer an 
inclusive upper bound in the API (even if we then increment it internally)?
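
To make the "increment it internally" idea concrete, here is a minimal sketch. The helper name is hypothetical and this is not the existing predicate code. It assumes integer and string range columns only, and it ignores overflow at the maximum value of a fixed-width integer type, which a real implementation would have to handle.

```python
def inclusive_to_exclusive_upper(bound):
    """Return the smallest value strictly greater than `bound`.

    Hypothetical helper: lets an API accept an inclusive upper bound
    while the stored range keeps an exclusive upper bound.
    """
    if isinstance(bound, int):
        # Integers: the successor is bound + 1 (overflow not handled here).
        return bound + 1
    if isinstance(bound, str):
        # Strings: appending NUL gives the immediate lexicographic successor.
        return bound + "\0"
    if isinstance(bound, bytes):
        return bound + b"\0"
    raise TypeError("unsupported range column type: %r" % type(bound))
```

With this, an inclusive bound of "Europe" becomes the exclusive bound "Europe\0", which is the form the doc's example currently asks users to write by hand.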


Line 87:               RANGE BOUND (("North America"), ("North America\0")),
       :               RANGE BOUND (("Europe"), ("Europe\0")),
       :               RANGE BOUND (("Asia"), ("Asia\0"));
> The examples shown are "exact match" types of partitions, and I wonder if s
Ah yes, this is the use case I noted above, and the thing that I think is too 
ugly to make users do.
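
As a rough illustration of what a friendlier API could do on the user's behalf, the sketch below expands plain list values into the [value, value + "\0") ranges from the example. The function names are made up for illustration and are not part of any proposed API.

```python
def list_value_to_range(value):
    """Expand one exact-match value into a (lower, exclusive upper) pair."""
    return (value, value + "\0")


def covering_range(key, ranges):
    """Return the range containing `key`, or None if no range covers it."""
    for lower, upper in ranges:
        if lower <= key < upper:
            return (lower, upper)
    return None


# The three partitions from the design doc example, built from bare values.
ranges = [list_value_to_range(v) for v in ("North America", "Europe", "Asia")]
```

Here `covering_range("Europe", ranges)` finds the Europe partition, while a key such as "Europe2" or "Africa" falls in a non-covered gap.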


Line 121: only
        : recontacting the master after a configurable timeout.
> I'm trying to come up with a perfectly consistent way to avoid skipping val
Yeah, I don't think all of the above is worth it. An explicit "refresh" command 
could be useful, and otherwise I think we can document that existing clients 
might not see added partitions immediately.

For the typical use case of creating a partition for each day, you'd create 
tomorrow's partition at 11pm or something anyway, so a 5 or 10 second lag in 
visibility to readers shouldn't be a big deal, so long as it's clearly 
documented.
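
A minimal sketch of the behaviour I have in mind, assuming a client-side cache with a configurable timeout plus an explicit refresh. The class and the `fetch_partitions` callback are hypothetical stand-ins, not the actual client API.

```python
import time


class PartitionCache:
    """Caches the partition list; recontacts the master only after a timeout."""

    def __init__(self, fetch_partitions, ttl_seconds=10.0, clock=time.monotonic):
        self._fetch = fetch_partitions  # hypothetical call to the master
        self._ttl = ttl_seconds
        self._clock = clock
        self._partitions = None
        self._fetched_at = None

    def refresh(self):
        """Explicit refresh: always recontact the master."""
        self._partitions = list(self._fetch())
        self._fetched_at = self._clock()
        return self._partitions

    def partitions(self):
        """Return the cached list, refetching only if the timeout has elapsed."""
        expired = (self._fetched_at is None
                   or self._clock() - self._fetched_at >= self._ttl)
        if expired:
            return self.refresh()
        return self._partitions
```

A reader that never calls `refresh()` sees a newly added partition at most `ttl_seconds` late, which is the lag that would need to be documented.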


Line 140: In the case of a scan, no results will be
        : returned from the non-existent tablet.
> Should this be configurable? Is there a use case wherein an application _wa
I don't think so, because logically you've just removed the data. The tablets 
are more of an implementation detail from the data perspective (at least, we 
should endeavour to have that be the case as much as possible).
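
To illustrate the "logically you've just removed the data" view, here is a toy model (not the real scanner) in which a table is a dict from range bounds to rows. Scanning across a gap or a dropped range simply yields nothing for it rather than raising an error.

```python
def scan(tablets, lower, upper):
    """Yield (key, value) with lower <= key < upper from existing tablets only.

    `tablets` maps (range_lower, range_upper) -> {key: value}. This is a toy
    model: key ranges with no tablet contribute no rows and no error.
    """
    for (t_lower, t_upper), rows in sorted(tablets.items()):
        for key in sorted(rows):
            if max(lower, t_lower) <= key < min(upper, t_upper):
                yield key, rows[key]
```

Dropping the entry for a range makes its rows disappear from later scans, exactly as if they had been deleted.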


Line 159: dropped range partitions.
In person we had discussed some ideas around keeping dropped tablets alive for 
some amount of time, with a "dead_as_of_timestamp" metadata entry attached. 
Snapshot scans in the past would see these tablets as live, and then we'd 
_actually_ delete them based on the same logic we use to actually delete old 
UNDO deltas.

I think we decided it might be a bit complicated to implement but that it was 
certainly possible and had better semantics for the Impala/MR case, even if we 
don't want to do it in rev 1.
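
A rough sketch of the "dead_as_of_timestamp" idea follows. Only that field name comes from the discussion; the types, function names, and the simple retention-period rule for physical deletion are assumptions standing in for whatever logic we use for old UNDO deltas.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TabletEntry:
    tablet_id: str
    # None means the tablet is live; otherwise the time its range was dropped.
    dead_as_of_timestamp: Optional[int] = None


def visible_at(tablet, snapshot_ts):
    """A snapshot scan sees the tablet if it was dropped after the snapshot."""
    return (tablet.dead_as_of_timestamp is None
            or snapshot_ts < tablet.dead_as_of_timestamp)


def can_physically_delete(tablet, now, history_retention):
    """Assumed rule: delete once no retained snapshot can still see the tablet."""
    if tablet.dead_as_of_timestamp is None:
        return False
    return now - tablet.dead_as_of_timestamp >= history_retention
```

For example, a tablet dropped at timestamp 100 stays visible to a snapshot scan at 99, is invisible to one at 100, and becomes eligible for deletion once `now` reaches 100 plus the retention period.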


-- 
To view, visit http://gerrit.cloudera.org:8080/2772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e530eda60c00faf066c41b6bdb2b37f6d96a5dc
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Binglin Chang <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes
