On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Fri, Oct 18, 2019 at 8:45 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
> >
> > On Thu, Oct 17, 2019 at 4:00 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > >
> > > On Thu, Oct 17, 2019 at 3:25 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> > > >
> > > > On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > > > >
> > > > > On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > > > >
> > > > > > Another point in this regard is that the user anyway has an option to
> > > > > > turn off the cost-based vacuum. By default, it is anyway disabled.
> > > > > > So, if the user enables it, we have to provide some sensible behavior.
> > > > > > If we can't come up with anything, then, in the end, we might want to
> > > > > > turn it off for a parallel vacuum and mention the same in the docs,
> > > > > > but I think we should try to come up with a solution for it.
> > > > >
> > > > > I finally got your point and now understand the need. And the idea I
> > > > > proposed doesn't work well.
> > > > >
> > > > > So you meant that all workers share the cost count, and if a parallel
> > > > > vacuum worker increases the cost and it reaches the limit, only that
> > > > > one worker sleeps? Is that okay even though other parallel workers
> > > > > are still running, in which case the sleep might not help?
> > > >
> > > Remember that the other running workers will also increase
> > > VacuumCostBalance, and whichever worker finds that it has become greater
> > > than VacuumCostLimit will reset its value and sleep. So, won't this
> > > make sure that the overall throttling works the same?
> > >
> > > > I agree with this point. There is a possibility that some of the
> > > > workers who are doing heavy I/O continue to work, and OTOH other
> > > > workers who are doing very little I/O might become the victim and
> > > > unnecessarily delay their operation.
> > > >
> > >
> > > Sure, but will it impact the overall I/O? I mean to say that the rate
> > > limit we want to provide for the overall vacuum operation will still be
> > > the same. Also, doesn't a similar thing happen even now, where the heap
> > > might have done a major portion of the I/O, but soon after we start
> > > vacuuming the index we will hit the limit and sleep?
> >
> > Actually, what I meant is that the worker performing actual I/O
> > might not go for the delay, while another worker which has done only
> > CPU work might pay the penalty. So basically, the worker doing
> > CPU-intensive work might go for the delay and pay the penalty, while
> > the worker performing actual I/O continues to work and does further
> > I/O. Do you think this is not a practical problem?
> >
>
> I don't know. Generally, we try to delay (if required) before
> processing (reading/writing) one page, which means it will happen for
> I/O-intensive operations, so I am not sure if the point you are making
> is completely correct.
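For reference, the existing per-backend throttling is a delay check made
once per page access. The following is a simplified sketch modeled on
vacuum_delay_point() in src/backend/commands/vacuum.c, not the verbatim
source:

#include "postgres.h"
#include "miscadmin.h"

/*
 * Each buffer access adds VacuumCostPageHit, VacuumCostPageMiss, or
 * VacuumCostPageDirty to VacuumCostBalance; this check then naps once
 * the accumulated balance crosses VacuumCostLimit.
 */
void
vacuum_delay_point(void)
{
	CHECK_FOR_INTERRUPTS();

	if (VacuumCostActive && !InterruptPending &&
		VacuumCostBalance >= VacuumCostLimit)
	{
		double		msec;

		/* Sleep proportionally to the accumulated cost, with a cap. */
		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
		if (msec > VacuumCostDelay * 4)
			msec = VacuumCostDelay * 4;

		pg_usleep((long) (msec * 1000));

		/* Start accumulating again from zero. */
		VacuumCostBalance = 0;
	}
}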
OK, I agree that we check the balance only when we perform an I/O
operation. But we also need to consider that each I/O operation has a
different weight. So even with a delay point at every I/O operation,
there is a possibility that we delay a worker which is merely reading
pages that are already in shared buffers (VacuumCostPageHit = 1), while
the other worker, which is actually dirtying pages (VacuumCostPageDirty
= 20), continues to work and does more I/O.

> > Stepping back a bit, OTOH, I think that we cannot guarantee that the
> > one worker who has done more I/O will continue to do further I/O, and
> > that the one which has not done much I/O will not perform more I/O in
> > the future. So it might not be too bad if we compute shared costs as
> > you suggested above.
> >
>
> I am thinking that if we can write patches for both approaches (a.
> compute shared costs and try to delay based on that; b. try to divide
> the I/O cost among workers as described in the email above[1]) and do
> some tests to see the behavior of the throttling, that might help us in
> deciding what is the best strategy to solve this problem, if any.
> What do you think?

I agree with this idea. I can come up with a POC patch for approach
(b). Meanwhile, if someone is interested in quickly hacking on approach
(a), we can do some testing and compare. Sawada-san, by any chance,
would you be interested in writing a POC for approach (a)? Otherwise, I
will try to write it after finishing the first one (approach (b)). A
very rough sketch of what the approach (a) check could look like is in
the PS below.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
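PS: To make the comparison concrete, here is a very rough sketch of the
shared-balance delay check for approach (a). It assumes the leader puts
a single pg_atomic_uint32 counter in the parallel vacuum's shared
memory; the function and variable names are made up for illustration
and are not from any existing patch:

#include "postgres.h"
#include "miscadmin.h"
#include "port/atomics.h"

/*
 * Hypothetical delay check for approach (a): every worker accounts its
 * cost against a single balance in shared memory, and whichever worker
 * pushes that balance over VacuumCostLimit sleeps on behalf of all.
 */
static void
parallel_vacuum_delay_point(pg_atomic_uint32 *shared_cost_balance)
{
	uint32		balance;

	if (!VacuumCostActive || InterruptPending)
		return;

	/* Publish the cost this worker accumulated since the last check. */
	balance = pg_atomic_add_fetch_u32(shared_cost_balance,
									  (uint32) VacuumCostBalance);
	VacuumCostBalance = 0;

	if (balance >= VacuumCostLimit)
	{
		/*
		 * Try to claim the accumulated balance.  If another worker has
		 * added cost concurrently, the exchange fails and the sleep
		 * simply happens at a later delay point.
		 */
		if (pg_atomic_compare_exchange_u32(shared_cost_balance,
										   &balance, 0))
		{
			double		msec;

			msec = VacuumCostDelay * balance / VacuumCostLimit;
			if (msec > VacuumCostDelay * 4)
				msec = VacuumCostDelay * 4;

			pg_usleep((long) (msec * 1000));
		}
	}
}

Note that this sketch still has exactly the weightage problem above:
the worker that happens to cross the limit pays the whole delay,
regardless of how much its own page hits/misses/dirties contributed.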