Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3481: Use Kudu ScanToken API for scan ranges ......................................................................
Patch Set 4: (21 comments) http://gerrit.cloudera.org:8080/#/c/4120/4/be/src/exec/kudu-scan-node-test.cc File be/src/exec/kudu-scan-node-test.cc: Line 53: class KuduScanNodeTest : public testing::Test { we don't have unit tests for any other exec node. why is this an exception? http://gerrit.cloudera.org:8080/#/c/4120/4/be/src/exec/kudu-scan-node.cc File be/src/exec/kudu-scan-node.cc: Line 247: VLOG(1) << "Thread started: " << name; 1 is too chatty. also, please use the macros. Line 328: VLOG(1) << "Thread done: " << name; change http://gerrit.cloudera.org:8080/#/c/4120/4/be/src/exec/kudu-scan-node.h File be/src/exec/kudu-scan-node.h: Line 39: /// are used to retreive the rows for this scan. retrieve Line 120: void ScannerThread(const string& name, const string* initial_token); rename to some verb Line 125: Status ProcessScanToken(KuduScanner* scanner, const string* scan_token); why const*? can this be null? http://gerrit.cloudera.org:8080/#/c/4120/4/be/src/exec/kudu-scanner.cc File be/src/exec/kudu-scanner.cc: Line 114: if (batch_done) break; why break instead of return? http://gerrit.cloudera.org:8080/#/c/4120/4/be/src/exec/kudu-scanner.h File be/src/exec/kudu-scanner.h: Line 34: /// Wraps a Kudu client scanner to fetches row batches from Kudu. The Kudu client scanner to fetch Line 114: int cur_kudu_batch_num_scanned_; explain http://gerrit.cloudera.org:8080/#/c/4120/4/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 976: scan_range_length = 1000; can't the fe get that out of the scan token? http://gerrit.cloudera.org:8080/#/c/4120/4/fe/src/main/java/com/cloudera/impala/planner/KuduScanNode.java File fe/src/main/java/com/cloudera/impala/planner/KuduScanNode.java: Line 107: try (KuduClient client = new KuduClientBuilder( break before 'new' instead Line 131: * TODO: Remove when the columns are fetched from Kudu directly during analysis. does kudu cache all metadata? where would the column info get fetched from? Line 139: tableSchema.getColumn(colName); note to kudu api designers: it's nice not to use exceptions to signal return values. is that a sufficient test? what about data types? Line 165: // TODO: Consider only using the LEADER replica. for better cache locality? Line 167: TNetworkAddress address = new TNetworkAddress(replica.getRpcHost(), break before new instead Line 203: for (KuduPredicate predicate: kuduPredicates_) { single line Line 267: * bounds of the result can be evaluated with Expr::GetValue(NULL)). this presumable has some side effects. describe them. Line 295: // Cannot push prediates with null literal values is there a kudu jira for this? Line 349: private static ComparisonOp getKuduOperator(BinaryPredicate.Operator op) { use qualified name for ComparisonOp, easier to identify that it's a "foreign" type that way Line 351: case GT: return ComparisonOp.GREATER; indentation http://gerrit.cloudera.org:8080/#/c/4120/4/fe/src/test/java/com/cloudera/impala/planner/KuduPlannerTest.java File fe/src/test/java/com/cloudera/impala/planner/KuduPlannerTest.java: Line 33: public class KuduPlannerTest extends PlannerTestBase { could we merge this back into the regular planner test? what was the reason for separating this? -- To view, visit http://gerrit.cloudera.org:8080/4120 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I160e5849d372755748ff5ba3c90a4651c804b220 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
