Hi Kavinder,

If your "multiple REST requests" problem is really caused by the planner being invoked twice, my suggestion is to avoid the related function call (maybe a call to a REST service) during the first planning stage.
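For illustration, a minimal sketch of what such a guard could look like; is_in_planning_phase() is the hook referenced below, while the wrapper and the PXF call are hypothetical stand-ins for wherever the fragmenter REST request is actually issued:

#include <stdbool.h>

/* The hook discussed in this thread (src/backend/optimizer/util/clauses.c);
 * assumed here to report true while the legacy planner runs its first pass. */
extern bool is_in_planning_phase(void);

/* Hypothetical stand-in for wherever PXF issues the fragmenter REST call. */
extern void request_fragments_from_pxf(const char *relname);

static void
maybe_request_fragments(const char *relname)
{
    /* Skip the REST round trip during the first of the two planning passes;
     * the second pass (and execution) still contacts the fragmenter. */
    if (is_in_planning_phase())
        return;

    request_fragments_from_pxf(relname);
}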
You can refer to src/backend/optimizer/util/clauses.c:2834, which uses is_in_planning_phase() to determine whether the legacy planner is in the first or the second stage.

Thanks

On Wed, Aug 31, 2016 at 12:31 AM, Lirong Jian <[email protected]> wrote:

> Hi Kavinder,
>
> If you are looking for some guy "to be blamed" for making the decision of
> invoking the planner twice on master, unfortunately, that guy would be me, :).
>
> First of all, I left the company (Pivotal Inc.) about half a year ago and am
> no longer working on the Apache HAWQ project full time, and I have had
> little time to keep tracking the code changes these days. Thus, what I am
> going to say is probably inconsistent with the latest code.
>
> *Problem*
>
> To improve the scalability and throughput of the system, we wanted to
> implement a dynamic resource allocation mechanism for HAWQ 2.0. In other
> words, we wanted to allocate exactly the resource needed (the number of
> virtual segments), rather than a fixed amount of resource (as in HAWQ 1.x),
> to run the underlying query. To achieve that, we need to calculate the cost
> of the query before allocating the resource, which means we need to figure
> out the execution plan first. However, in the framework of the current
> planner (either the old planner or the new optimizer ORCA), the optimal
> execution plan is generated given the to-be-run query and the number of
> segments. Thus, this is a chicken-and-egg problem.
>
> *Solution*
>
> IMO, the ideal solution to the above problem is an iterative algorithm:
> given a default number of segments, calculate the optimal execution plan;
> based on that plan, figure out the appropriate number of segments needed to
> run the query; then calculate the optimal execution plan again, and again,
> until the result is stable.
>
> *Implementation*
>
> In the actual implementation, we set the number of iterations to 2 for two
> major reasons: (1) two iterations are enough to give a good result; (2)
> there is some cost associated with invoking the planner, especially the new
> optimizer ORCA.
>
> After implementing the first version, we found that determining the number
> of virtual segments based on the cost of the query sometimes gave very bad
> results (although this is an issue of the planner, because the cost the
> planner provides doesn't correctly reflect the actual running cost of the
> query). So, borrowing the idea from Hadoop MapReduce, we calculate the cost
> based on the total size of all tables that need to be scanned for the
> underlying query. It seemed we no longer needed to invoke the planner before
> allocating resources. However, in our current resource manager, the
> allocated resource is segment-based, not process-based. For example, if an
> execution plan consists of three slices, we need to set up three processes
> on each segment to run this query, and one allocated resource unit (virtual
> segment) covers all three processes. To avoid the case where too many
> processes are started on one physical host, we need to know how many
> processes (the number of slices of the execution plan) are going to start
> on one virtual segment when we request resource from the resource manager.
> Thus, the execution plan is still needed. We could write a new function to
> calculate the number of slices of the plan rather than invoking the
> planner, but after some investigation, we found that the new function did
> almost the same thing as the planner. So, why bother writing more
> duplicated code?
>
> *Engineering Consideration*
>
> IMO, for the long term, maybe the best solution is to embed the logic of
> resource negotiation into the planner. In that case, the output of the
> planner would consist of the needed number of virtual segments and the
> associated optimal execution plan, and the planner could be invoked just
> once on the master.
>
> However, back then, we decided to separate the functionalities of resource
> negotiation and the planner completely. Although it may look a little ugly
> from an architecture view, it saved us a lot of code refactoring effort and
> communication cost among different teams. We did have a release deadline, :).
>
> Above is just my 2 cents.
>
> Best,
> Lirong
>
> Lirong Jian
> HashData Inc.
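As a rough illustration of the two-pass approach Lirong describes (plan with a default segment count, size the resource request from the resulting plan, then re-plan), here is a simplified sketch; Query, PlannedStmt, plan_query() and estimate_vsegs() are hypothetical stand-ins, not HAWQ's actual entry points:

typedef struct Query Query;             /* placeholder for the parsed query */
typedef struct PlannedStmt PlannedStmt; /* placeholder for the execution plan */

extern PlannedStmt *plan_query(Query *query, int num_vsegs);  /* hypothetical */
extern int estimate_vsegs(const PlannedStmt *plan);           /* hypothetical */

#define PLAN_ITERATIONS 2   /* two passes were judged enough in practice */

static PlannedStmt *
plan_with_resource_negotiation(Query *query, int default_vsegs, int *nvsegs_out)
{
    int          nvsegs = default_vsegs;   /* start from a default guess */
    PlannedStmt *plan = NULL;

    for (int i = 0; i < PLAN_ITERATIONS; i++)
    {
        /* plan as if nvsegs virtual segments were available */
        plan = plan_query(query, nvsegs);

        /* derive how many virtual segments this plan actually needs */
        nvsegs = estimate_vsegs(plan);
    }

    /* the caller would request nvsegs from the resource manager, then run plan */
    *nvsegs_out = nvsegs;
    return plan;
}

Capping the loop at two iterations trades some plan optimality for bounded planning cost, which matches the reasoning in the *Implementation* section above.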
> 2016-08-30 1:42 GMT+08:00 Goden Yao <[email protected]>:
>
> > Some background info:
> >
> > HAWQ 2.0 right now doesn't do dynamic resource allocation for PXF queries
> > (External Table). It was a compromise we made, as PXF used to have its
> > own allocation logic and we didn't get a chance to converge that logic
> > with HAWQ 2.0. So, to make it compatible (on performance) with HAWQ 1.x,
> > the current logic assumes external table queries need 8 segments per node
> > to execute (e.g., with 3 nodes in the cluster, it will need 24 segments).
> > If that allocation fails, the query fails and the user will see an error
> > message like "do not have sufficient resources" or "segments" to execute
> > the query.
> >
> > As I understand it, the 1st call is to get fragment info, and the 2nd
> > call is to optimize the allocation of fragments to segments based on the
> > info obtained from the 1st call and to generate the optimized plan.
> >
> > -Goden
> >
> > On Mon, Aug 29, 2016 at 10:31 AM Kavinder Dhaliwal <[email protected]>
> > wrote:
> >
> > > Hi,
> > >
> > > Recently I was looking into the issue of PXF receiving multiple REST
> > > requests to the fragmenter. Based on offline discussions I have a rough
> > > idea that this is happening because HAWQ plans every query twice on the
> > > master. I understand that this is to allow the resource negotiation
> > > that was a feature of HAWQ 2.0. I'd like to know if anyone on the
> > > mailing list can give more background on the history of the decision
> > > making behind this change for HAWQ 2.0 and whether this is only a
> > > short-term solution.
> > >
> > > Thanks,
> > > Kavinder

--
Thanks
Hubert Zhang
