On Wed, 24 Jun 2020, Wietse Venema wrote:

> > > Problem: nqmgr's fairness breaks down for single-recipient bulk mail
> > 
> > You describe the solutions but maybe you could describe the problem first?
> 
> My text was a bit disorganized, but the gist is that nqmgr uses a
> clever approach to interleave (bulk mail) multi-recipient deliveries
> with (non-bulk) single-recipient deliveries. This allows the non-bulk
> mail to trickle out *during* bulk delivery, instead of being stuck
> *after* the bulk delivery. I am making a simplification as nqmgr
> will also interleave workloads with other sizes, not just
> single-recipient and multi-recipient.
> 
> When the bulk mail is in the form of single-recipient messages,
> then nqmgr cannot distinguish between bulk mail and non-bulk mail,
> and these deliveries are no longer interleaved. Deliveries to a
> specific destination become sequential. Again, I am making
> a simplification, this time that all email is single-recipient.

Yep, this is a nice summary of how it works. And the below describes a 
solution, but I am still not sure which problem you are trying to solve.

(I really apologize, maybe it's been part of some prior discussion and 
thus I am missing the bit which led to this post. Honestly, I am not 
trying to play dumb here.)

It's obviously some congestion bothering you. So let me ask: where 
is the bottleneck?

1) particular destination is congested
2) all delivery agents are congested
3) in-memory queue is congested
4) something else?

and what is the "blocker" attribute?

A) particular mail recipient is the hog
B) destination itself is the hog (not happening AFAIK)
C) particular mail sender is the hog
D) particular mail is the hog
E) single recipient mail blocking multi recipient mail (not happening AFAICT)
F) multi recipient mail blocking single recipient mail (not happening AFAICT)
G) something else?

From the follow-up mails I have gathered that what is bugging you may be the 
same sender blocking things (blocking either the delivery agents or the 
in-memory queue, so 2C or 3C, I am not really sure). But that's what confuses 
me even further, because separate per-mail-recipient-count classes don't help 
with that, do they? (If all you get in your system is single-recipient mail, 
this type of classes doesn't change a thing.)

But if that's really the case (a single sender usurping everything), why 
complicate things? Rather than dealing with that during the outbound 
selection, I'd say cut it during the inbound phase. Don't even let it into 
the system. Simply don't pick up mail from a sender who already has too many 
recipients in the in-memory queue during the active queue scan. It prevents 
it from clogging the queue or even the agents (depending on the limit), is 
trivial to do, and requires no change to the scheduling algorithm itself.
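
A rough sketch of what I mean (Python rather than Postfix C; the limit, the 
counter and the function names are all made up, purely to illustrate the 
idea):

MAX_PENDING_PER_SENDER = 400          # hypothetical knob, not a real Postfix parameter

pending_recipients = {}               # sender -> recipients currently in the active queue

def may_pick_up(sender, recipient_count):
    """Decide during the queue scan whether to move this message into the
    active queue now, or leave it where it is for a later scan."""
    if pending_recipients.get(sender, 0) + recipient_count > MAX_PENDING_PER_SENDER:
        return False                  # skip for now; retry on the next scan
    pending_recipients[sender] = pending_recipients.get(sender, 0) + recipient_count
    return True

def delivery_done(sender, recipient_count):
    """Give the sender its quota back once its deliveries have completed."""
    pending_recipients[sender] -= recipient_count
    if pending_recipients[sender] <= 0:
        del pending_recipients[sender]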

> The idea then is to have three FIFO classes with delivery requests
> that are round-robin interleaved by design: one class for senders
> with, say, 1-20 pending delivery requests, one class for senders
> with 20-400, and one class for senders with more than 400 pending
> delivery requests. The scheduler has up to 20,000 delivery requests
> in memory.
> 
> For a given transport and destination, if all three classes have
> work then round-robin interleaving gives each class 1/3 of the
> delivery slots, 1/2 if there are only two classes with work, all
> slots if there is only one class with work, and zero if there is
> no work.
> 
> If implemented in oqmgr, it's brain-dead round-robin with FIFO.
> What could possibly go wrong? :-) Well it does not interleave
> deliveries within the same class. But, with single-recipient messages,
> that wasn't happening anyway, so we are not making things worse.
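
For concreteness, if I read it right, the proposal boils down to something 
like this (a rough sketch only, not oqmgr code; the thresholds are the ones 
from your description, everything else is made up):

from collections import deque

THRESHOLDS = (20, 400)                  # 1-20, 21-400, >400 pending requests

classes = [deque(), deque(), deque()]   # one FIFO per class

def class_of(pending_count):
    if pending_count <= THRESHOLDS[0]:
        return 0
    if pending_count <= THRESHOLDS[1]:
        return 1
    return 2

def enqueue(request, sender_pending_count):
    classes[class_of(sender_pending_count)].append(request)

_turn = 0

def next_request():
    """Round robin over the classes that have work: each busy class gets
    1/3 of the slots if all three are busy, 1/2 if two are, all if one is."""
    global _turn
    for i in range(len(classes)):
        cls = classes[(_turn + i) % len(classes)]
        if cls:
            _turn = (_turn + i + 1) % len(classes)
            return cls.popleft()
    return None                         # no work at all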

Let's put aside that I don't know what you are solving. Regardless of that, 
what you suggest has its own share of problems. You either have too few 
classes, or too many (David pointed at this as well, if I understood it 
correctly within the context I have so far). Each sucks in a different way.

1) Too few classes - in which class will a 100-recipient mail fit? Will a 
1-recipient mail have to wait until several 100-recipient mails ahead of it 
are delivered? Or will a 100-recipient mail wait until a few 1000-recipient 
mails ahead of it are delivered? No matter how you set the thresholds, with 
too few classes it will always be too coarse and will cause substantial 
delays.

(For comparison, nqmgr has a kind of implicit log2(n) classes, even if their 
boundaries may be a bit more fuzzy.)

2) Too many classes - let's assume you will have log2(n) classes (similar 
to what nqmgr implicitly has). It seems to work fine, but actually that's 
where the round robin hits us hard - you inflate the delivery of some mails 
beyond reason. Let's say you have 16 classes, all delivering mail. With 
round robin, you are inflating the delivery of mail in each class by the 
deliveries of all the other classes. This means each mail's delivery can be 
stretched 16 times. As you put it some 20+ years ago when I was proposing 
round robin for the first time - the first recipients of some mails all 
arrive today while the last recipients of the same mails arrive tomorrow. It 
sucked then and I believe it still sucks today.

(For comparison, nqmgr never inflates any mail delivery more than twice, 
even with the most aggressive preemption settings enabled.)

So, this is why I think what you propose is not the way to go. 
Unfortunately, I don't think there is any sweet spot of "just the right 
amount of classes" which would work wonderfully. I am afraid it would suffer 
from the worst of both cases, being coarse enough to delay shorter mails too 
much while at the same time inflating some deliveries way too much.

> > I don't remember any of it now but maybe I wrote the most important
> > ideas somewhere, so I can have a look.
> 
> I remember we discussed classes, but I don't think that I was able
> to reduce the concept to a combination of really brain-dead simple
> concepts.
> 
> I think it would work (in oqmgr), I just need to find time to do this.

Well, I don't want to discourage you, but the above is why I am rather 
skeptical about the result.

FWIW, in the meantime, I had a look and found my notes from July 2013, when 
we were last discussing the classes. It's quite an interesting read (for me, 
at least). It contains various notes for the arguments I prepared for our 
future discussions, as well as the solution to the problem we were solving 
back then.

It turns out we were indeed discussing the classes as well, but those were 
"slow" and "fast" classes instead. My biggest gripe with nqmgr (or oqmgr, 
for that matter) is that on a saturated system, deliveries of "slow" mail to 
"slow" destinations eventually hog most of the delivery agents, considerably 
slowing down any other mail, both "slow" and "fast", sitting in the queue.

We were trying to come up with a solution that would solve this by using 
the mail size as an attribute somehow, and dedicating some portion of the 
agents to certain classes. After some serious thinking, my revelation was 
that to solve this properly, we should in fact measure the time the mail 
spends "on the wire" (effectively measuring per-destination throughput), and 
use this information to drive the delivery agent assignment decisions. It's 
quite obvious in hindsight, but that's something we had never discussed 
before. The solution I had back then would work marvelously even with a 
single agent, but in the end I never pursued this further to turn it into an 
actual implementation.

I attach the notes for your perusal, maybe you'll find some of it 
interesting, too. It's too brief to be generally useful as is, but if 
something piques your interest, I can try to remember and turn it into 
something more understandable.

BTW, one section even discusses the explicit vs implicit preemption, which 
is basically the class based round robin being discussed above. :)

Best,

Patrik

Queues are sequential. Order matters.

A metric for messed up ordering is the minimum number of swaps needed to get 
from order A to order B.
Every swap breaks the order in one pair, helping one message and harming the 
other one.
So this metric gives the number of hurt (as well as helped) messages.
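
For illustration (a throwaway sketch, assuming each swap exchanges two 
adjacent messages, so the metric is simply the number of message pairs whose 
relative order differs between submission order A and delivery order B):

def order_swaps(a, b):
    # Count pairs delivered in the opposite order from how they were submitted.
    pos_in_b = {msg: i for i, msg in enumerate(b)}
    swaps = 0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            if pos_in_b[a[i]] > pos_in_b[a[j]]:   # this pair got swapped
                swaps += 1
    return swaps

print(order_swaps(["m1", "m2", "m3"], ["m1", "m2", "m3"]))   # 0 - order preserved
print(order_swaps(["m1", "m2", "m3"], ["m3", "m1", "m2"]))   # 2 - m3 jumped two mails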

Fairness is an elusive term. Average message delays and the number of 
reordering swaps are metrics. Fairness is not.

Solving congestion alone is simple. The point is doing it in such a way that 
only improves things.

---------------

What the user wants from the mail system (fast, order preserving within 
reason).

Tracking both per user, as well as globally (although the global reordering 
is less relevant).
Both are interesting from the recipient's view as well, not only the 
sender's.

Sending big message, then small. Or list mail, then single.

------------------

So, the point is to minimize order swaps and minimize delays.
Increasing swaps slightly is ok if it leads to considerably decreased delays,
but whatever leads to increasing both swaps and delays at the same time is 
stupid.

The relations between classes are not symmetrical.
If putting class A in front of B improves things, doing the opposite can't 
improve things as well.

Example with old lady with a full shopping cart, both right and wrong.

Example with separate queues for black and white. (or trucks and cars on 
horizon).
Example with big message sent much later than small, but arrives first.

Example with class storming the shop.

-------------------------------

Some problems are quantitative, some qualitative. They manifest themselves 
differently.

(The former are concerned with multitude (count), while the latter with 
magnitude (order), which are by the way both quantitative properties 
(measurable), not qualitative properties (taste, like). OTOH, in general, 
quality refers to an attribute or property which is not measurable on its 
own (red), unless one assigns it some meaning or order.)

It's possible to convert back and forth, though - counting items of some 
quality or grouping objects of some quantity.

Quantitative problems can be solved by counting. Qualitative can't, at least 
not in an optimal way.

Example with a car with an empty gas tank, (frozen diesel), and diesel with 
petrol added. All make the car stop.
Recognizing what the problem is differs, though, as does fixing it.

XXX - not the best analogy... Example with two canisters, one with diesel, 
one with diesel with petrol added. Which one to use? Either you compare, or 
you get average results (mixing stuff).

-------------------------------

Two people sending list mail vs two people sending bulk mail - the first 
should be sequential, while the latter could be either sequential or round 
robin, depending on when and how they sent it (interleaved vs sequential).
Fundamental difference. The list sender has this guarantee, bulk doesn't.

Also, explicit list mail preemption is smarter in candidate selection. Simple 
implicit preemption would just choose whatever's available next - not good 
enough wrt message/recipient delays. Could lead to stretching list delivery too 
much by simple interleaving, because it lacks slot grouping and thus tighter 
message start/stop delay.

Also, explicit preemption works with a single agent. Implicit preemption can 
be done for quantitative classes by counting, but not for qualitative ones - 
you need to somehow measure. That's why the latter work better with implicit 
preemption using multiple delivery agents, as that does the measuring for 
us. But for more than two classes, one needs the ordering, to prevent 
starving the remaining classes. Or suffer the average results, as usual.


---------------------

Replace the number of agents used with a leaky bucket based predicate. 
Basically the time on wire used by each class. Has to use time estimates 
(corrected afterwards) instead of the self-adjusting measurement of multiple 
agents. The rest works the same.

Hmm, the catch is that we don't know how many delivery agents there are (of 
course) - so we have no problem limiting some class to X% of the total when 
there is enough stuff being delivered, but it is difficult to leave Y% of 
delivery agents idle when there is nothing but mail from that class going 
out. Could it be somehow derived from how much is still on the wire compared 
to how much has already finished?

In other words, given a single agent, having just sent a slow message, how 
do we know if we can send the next slow one or if we should look whether 
there is a fast one? Or do we always look, and only use the slow one anyway 
if nothing better is found at all? Maybe that's how we do it - instead of 
using the first one which can go or nothing, we use the best candidate we 
find - with the unrestricted one being the winner, of course, if there is 
one.

Ahaaa, actually we know how many are ready in each class, so it is merely a 
matter of checking if there is something better at all - in which case we 
will find it (unless it's concurrency window restricted, of course), or we 
don't have to bother.
Hmm, but it can be one of the blockers BEFORE the current job, so the best 
choice fallback has to be used anyway.

The big advantage is not only that it is independent of the total number of 
agents, but also that it automatically adapts to any number of agents on the 
fly - if the total limit is 1000 but only 20 are being used because of the 
concurrency window(s), it adjusts to this limit automatically.
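
To make this a bit more concrete, here is roughly how the accounting could 
look (a throwaway sketch only: it uses plain cumulative totals instead of a 
proper leaky bucket, and the class names, shares and function names are all 
made up; the real thing would live in the queue manager's job selection):

SHARE = {"fast": 0.8, "slow": 0.2}    # hypothetical target shares of wire time

used = {"fast": 0.0, "slow": 0.0}     # wire time charged to each class so far

def over_share(cls):
    total = sum(used.values())
    return total > 0 and used[cls] / total > SHARE[cls]

def pick(candidates):
    """candidates: list of (cls, job) ready to go. Prefer a job whose class
    is still within its share; fall back to a restricted one rather than
    leaving the agent idle."""
    for cls, job in candidates:
        if not over_share(cls):
            return cls, job
    return candidates[0] if candidates else None

def charge_estimate(cls, estimated_seconds):
    used[cls] += estimated_seconds                    # charged up front...

def correct_estimate(cls, estimated_seconds, actual_seconds):
    used[cls] += actual_seconds - estimated_seconds   # ...fixed up afterwards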

-----

Another type of congestion would be per-destination congestion. It affects 
the per-destination average message delays. It's similar to the bulk sender 
problem - if the messages/entries for/from a few Xs consume too much time, 
the other Xs are congested. Not really a problem in practice, but a nice 
example.

Metrics:
Order swaps (increasing)
Average message delays
Per sender delays
Per destination delays
Total time it takes to deliver everything (same)
...

----
