Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/16031 )

Change subject: KUDU-1802: Avoid call to master when deserializing scan tokens
......................................................................


Patch Set 3:

One potential concern here: the list of splits/tokens for a given query/job 
is O(number of tablets) and may be submitted as part of the job. With this 
change, the table metadata is duplicated in every token, i.e. O(n) times 
instead of just once for the job. Could that be a problem for tables with 
thousands of tablets and big schemas? Are we going to hit RPC size limits or 
task description size limits?
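
A rough back-of-envelope sketch of the size concern. The function name and all sizes below are made-up placeholders for illustration, not measurements from Kudu:

```python
def total_token_bytes(num_tablets, metadata_bytes, per_tablet_bytes, embed_metadata):
    """Estimate the total bytes of serialized tokens submitted with a job.

    All inputs are hypothetical sizes, not measured Kudu values.
    """
    if embed_metadata:
        # Every token carries its own copy of the table metadata.
        return num_tablets * (metadata_bytes + per_tablet_bytes)
    # Metadata is shipped once; tokens carry only per-tablet information.
    return metadata_bytes + num_tablets * per_tablet_bytes


# Assumed example: 5,000 tablets, 50 KiB of schema/partition metadata,
# 200 bytes of per-tablet scan state.
embedded = total_token_bytes(5000, 50 * 1024, 200, embed_metadata=True)
shared = total_token_bytes(5000, 50 * 1024, 200, embed_metadata=False)
print(embedded, shared)
```

Under these assumed numbers the embedded form comes to 257,000,000 bytes against 1,051,200 bytes when the metadata is sent once, so the gap grows linearly with the tablet count.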

An alternative to consider is letting the client serialize a separate 
"table metadata" token and broadcasting that across the tasks (e.g. in a 
Spark job) separately from the per-task tokens. Curious whether you 
considered that.
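
A minimal sketch of what such a split could look like. The class and function names (`TableMetadataToken`, `TaskToken`, `hydrate`) are hypothetical and are not part of the Kudu client API, and JSON stands in for whatever wire format the real tokens use:

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TableMetadataToken:
    # Hypothetical: shared by every task, serialized once per job.
    table_id: str
    schema: tuple  # column names only, for illustration


@dataclass(frozen=True)
class TaskToken:
    # Hypothetical: small per-tablet piece, one per task.
    table_id: str
    tablet_id: str


def serialize(obj) -> bytes:
    """Serialize a token dataclass to bytes (JSON is an assumption here)."""
    return json.dumps(asdict(obj)).encode("utf-8")


def hydrate(metadata_bytes: bytes, task_bytes: bytes) -> dict:
    """Combine the broadcast metadata with one task's token on the worker."""
    meta = json.loads(metadata_bytes)
    task = json.loads(task_bytes)
    if meta["table_id"] != task["table_id"]:
        raise ValueError("task token does not match broadcast metadata")
    return {"tablet_id": task["tablet_id"], "schema": meta["schema"]}


meta = TableMetadataToken("t1", ("key", "value"))
tasks = [TaskToken("t1", f"tablet-{i}") for i in range(3)]

broadcast = serialize(meta)               # sent once, e.g. as a broadcast variable
per_task = [serialize(t) for t in tasks]  # one small payload per task
scans = [hydrate(broadcast, b) for b in per_task]
```

The per-task payloads stay small regardless of schema size, at the cost of requiring the job framework to deliver the shared metadata to every worker.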


--
To view, visit http://gerrit.cloudera.org:8080/16031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88c1b8392de37dd5e8b7bd8b78a21603ff8b1d1b
Gerrit-Change-Number: 16031
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Fri, 05 Jun 2020 22:48:21 +0000
Gerrit-HasComments: No
