Hi all,

Here is a change <https://gerrit.cloudera.org/4554> that implements a benchmark for SimpleScheduler::ComputeScanRangeAssignment() to address IMPALA-4086 <https://issues.cloudera.org/browse/IMPALA-4086>.

I would like to discuss whether it is possible to run the benchmark against the Schedule() method instead. This would require changes to the scheduler test utility classes in simple-scheduler-test-util.h so that they can create a TQueryExecRequest message suitable for calling Schedule().

Currently we compute the following inputs before calling ComputeScanRangeAssignment(). They are essentially what a single plan node contains:

  BackendConfig
  vector<TScanRangeLocations>
  vector<TNetworkAddress>
  TQueryOptions

To build a schedule object we need to build a TQueryExecRequest, which has 14 fields. The complex ones are:

  optional Descriptors.TDescriptorTable desc_tbl
  optional list<Planner.TPlanFragment> fragments
  optional list<i32> dest_fragment_idx
  optional map<Types.TPlanNodeId, list<Planner.TScanRangeLocations>> per_node_scan_ranges
  optional list<TPlanExecInfo> mt_plan_exec_info
  optional Results.TResultSetMetadata result_set_metadata
  optional TFinalizeParams finalize_params
  required ImpalaInternalService.TQueryCtx query_ctx
  optional string query_plan
  required list<Types.TNetworkAddress> host_list
  optional LineageGraph.TLineageGraph lineage_graph

Some of these members have further dependencies. For example, each fragment contains the plan, which in turn contains all plan nodes:

  TQueryExecRequest:
    list<Planner.TPlanFragment> fragments
      partition.type
      plan.nodes[node_id]
        node_id (for dcheck)
        node.hdfs_scan_node (can be unset)
      idx (for sorting in query-schedule)
    TQueryCtx query_ctx (only for query options, which we already have)

I think it makes sense to benchmark ComputeScanRangeAssignment() in isolation, since its implementation is reasonably complex, i.e. its cost is not just linear in the input size. Before we benchmark Schedule(), we should first consider writing proper unit tests for the SimpleScheduler and extending the test utility code where necessary to support them.

I am curious to hear any feedback.

Thanks,
Lars
