Note that I have a Jira up right now for a bug that Daryn found while testing FCQ internally. Not sure if it is relevant to what you are seeing. https://issues.apache.org/jira/browse/HADOOP-17342
Jim

On Thu, Nov 5, 2020 at 11:43 AM Fengnan Li <loyal...@gmail.com> wrote:

> Thanks for the response Daryn!
>
> I agree with you that the overall average qtime will increase due to the
> penalty FCQ imposes on heavy users. However, in our environment, out of
> that same consideration, I intentionally turned off call selection between
> queues, i.e. the cost is calculated as usual, but all users stay in the
> first queue. This is to avoid the overall impact.
>
> Here are our configs; the red one is what I added for internal use to turn
> on this feature (so that only selected users are actually added to the
> second queue when their cost reaches the threshold).
>
> There are two patches for Cost Based FCQ:
> https://issues.apache.org/jira/browse/HADOOP-16266
> and
> https://issues.apache.org/jira/browse/HDFS-14667
> . Which version are you using?
>
> I am right now trying to debug one by one.
>
> Thanks,
> Fengnan
>
> <property>
>   <name>ipc.8020.callqueue.capacity.weights</name>
>   <value>99,1</value>
> </property>
> <property>
>   <name>ipc.8020.callqueue.impl</name>
>   <value>org.apache.hadoop.ipc.FairCallQueue</value>
> </property>
> <property>
>   <name>ipc.8020.cost-provider.impl</name>
>   <value>org.apache.hadoop.ipc.WeightedTimeCostProvider</value>
> </property>
> <property>
>   <name>ipc.8020.decay-scheduler.blacklisted.users.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>ipc.8020.decay-scheduler.decay-factor</name>
>   <value>0.01</value>
> </property>
> <property>
>   <name>ipc.8020.decay-scheduler.period-ms</name>
>   <value>20000</value>
> </property>
> <property>
>   <name>ipc.8020.decay-scheduler.thresholds</name>
>   <value>15</value>
> </property>
> <property>
>   <name>ipc.8020.faircallqueue.multiplexer.weights</name>
>   <value>99,1</value>
> </property>
> <property>
>   <name>ipc.8020.scheduler.priority.levels</name>
>   <value>2</value>
> </property>
>
> From: Daryn Sharp <da...@verizonmedia.com>
> Date: Thursday, November 5, 2020 at 9:19 AM
> To: Fengnan Li <loyal...@gmail.com>
> Cc: Hdfs-dev <hdfs-dev@hadoop.apache.org>
> Subject: Re: [E] Cost Based FairCallQueue latency issue
>
> I submitted the original 2.8 cost-based FCQ patch (thanks to community
> members for porting to other branches). We've been running with it since
> early 2019 on all clusters. Multiple clusters run at a baseline of ~30k+
> ops/sec with some bursting over 100k ops/sec.
>
> If you are looking at the overall average qtime, yes, that metric is
> expected to increase and means it's working as designed. De-prioritizing
> write heavy users will naturally result in increased qtime for those
> calls. Within a bucket, call N's qtime is the sum of the qtime+processing
> for the prior 0..N-1 calls.
> This will appear very high for congested low
> priority buckets receiving a fraction of the service rate and skew the
> overall average.
>
> On Fri, Oct 30, 2020 at 3:51 PM Fengnan Li <loyal...@gmail.com> wrote:
>
> Hi all,
>
> Has someone deployed Cost Based Fair Call Queue in their production
> cluster? We ran into some RPC queue latency degradation with ~30k-40k rps.
> I tried to debug but didn’t find anything suspicious. It is worth
> mentioning there is no memory issue coming with the extra heap usage for
> storing the call cost.
>
> Thanks,
> Fengnan
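For reference, Daryn's description of qtime within a bucket can be sketched numerically. What follows is a simplified model and not Hadoop code. It assumes every call is already queued at time zero and that each bucket receives a fixed share of handler time (0.99 and 0.01 here, mirroring the 99,1 multiplexer weights quoted above). The function name bucket_qtimes, the 1 ms processing times, and the backlog sizes are all made up for illustration. A real multiplexer may behave differently, for example by serving a lower-priority queue whenever a higher one is empty.

```python
from itertools import accumulate
from statistics import mean


def bucket_qtimes(processing_ms, service_share):
    """Return the queue time (ms) of each call in one FIFO bucket.

    Call N waits for calls 0..N-1 to finish. Because the bucket is assumed
    to receive only `service_share` of the handlers' time, each earlier call
    occupies processing / service_share of wall-clock time.
    """
    if not 0 < service_share <= 1:
        raise ValueError("service_share must be in (0, 1]")
    # Wall-clock time each call occupies, stretched by the bucket's share.
    effective = [p / service_share for p in processing_ms]
    # Running sums starting at 0; dropping the last gives sum(effective[:N]).
    return list(accumulate(effective, initial=0.0))[:-1]


# Hypothetical backlog: 99 high-priority and 10 low-priority calls, 1 ms each.
high = bucket_qtimes([1.0] * 99, service_share=0.99)
low = bucket_qtimes([1.0] * 10, service_share=0.01)
overall = mean(high + low)

print(f"high-priority mean qtime: {mean(high):.1f} ms")
print(f"low-priority mean qtime:  {mean(low):.1f} ms")
print(f"overall mean qtime:       {overall:.1f} ms")
```

Under these assumptions the low-priority bucket averages 450 ms of queue time against roughly 49.5 ms for the high-priority bucket, which pulls the combined average to about 86 ms even though only 10 of the 109 calls are low priority. Per-bucket or per-priority qtime is therefore likely a more useful thing to compare than the single overall average.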