> No. I am thinking to check none of them. I did the test and it appears that
> the performance suffers from just walking through the message chain to get
> the pointer of the tail message.

Ouch. One possibility would be to let the messages pile up and
periodically (maybe once a second when flow controlled?) have a reaper
thread run, tally the number of messages, and free the part of the chain
that's over the limit. This way, the messages are only counted e.g. once
a second instead of once every call to dld_tx_enqueue(). (It could also
move the messages that it's already counted to a separate message queue,
so that subsequent counts would just need to process the newly received
messages.)

Short of that, I think the best you could do is optimize the existing
counting code. For instance, I see no reason why get_mpsize() couldn't
process the entire message chain (avoiding the repeated function calls),
or why we couldn't inline the MBLKL() macro call in the common case of an
M_DATA message with no b_cont. But if most of the overhead is in
traversing the list, that probably won't do much good.

--
meem
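
For illustration only, here is a rough user-space sketch of the reaper
idea with a separate "already counted" queue. Everything in it is an
assumption made for the example: msg_t is a simplified stand-in for
mblk_t (not the real definition), and txq_t, msg_alloc(), msg_free(),
msg_size(), txq_enqueue() and txq_reap() are hypothetical names, not
existing dld interfaces. Locking between the enqueue path and the reaper
is left out, as is the dequeue side, which would have to subtract from
counted_bytes as messages are sent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a STREAMS mblk_t (NOT the real definition). */
typedef struct msg {
    struct msg *b_next;     /* next message in the chain */
    struct msg *b_cont;     /* next block of the same message */
    unsigned char *b_rptr;  /* first valid byte */
    unsigned char *b_wptr;  /* one past the last valid byte */
} msg_t;

/* Hypothetical queue: messages already tallied vs. newly arrived ones. */
typedef struct txq {
    msg_t *counted_head, *counted_tail;
    size_t counted_bytes;
    msg_t *pending_head, *pending_tail;
} txq_t;

/* Allocate a single-block message of len bytes (header and data together). */
static msg_t *msg_alloc(size_t len)
{
    msg_t *mp = calloc(1, sizeof (*mp) + len);
    if (mp == NULL)
        return NULL;
    mp->b_rptr = (unsigned char *)(mp + 1);
    mp->b_wptr = mp->b_rptr + len;
    return mp;
}

/* Free one message, including any continuation blocks. */
static void msg_free(msg_t *mp)
{
    while (mp != NULL) {
        msg_t *next = mp->b_cont;
        free(mp);
        mp = next;
    }
}

/* Size of one message; the single-block case costs one subtraction. */
static size_t msg_size(const msg_t *mp)
{
    size_t size = (size_t)(mp->b_wptr - mp->b_rptr);
    for (mp = mp->b_cont; mp != NULL; mp = mp->b_cont)
        size += (size_t)(mp->b_wptr - mp->b_rptr);
    return size;
}

/* Hot path: O(1) append, no counting and no walk to find the tail. */
static void txq_enqueue(txq_t *q, msg_t *mp)
{
    assert(mp != NULL && mp->b_next == NULL);
    if (q->pending_tail != NULL)
        q->pending_tail->b_next = mp;
    else
        q->pending_head = mp;
    q->pending_tail = mp;
}

/*
 * Periodic reaper: tally only the pending messages, move those that fit
 * to the counted list, and free the rest. Returns the number freed.
 */
static size_t txq_reap(txq_t *q, size_t limit)
{
    msg_t *mp = q->pending_head;
    size_t freed = 0;

    q->pending_head = q->pending_tail = NULL;
    while (mp != NULL) {
        msg_t *next = mp->b_next;
        size_t size = msg_size(mp);

        mp->b_next = NULL;
        if (freed == 0 && q->counted_bytes + size <= limit) {
            if (q->counted_tail != NULL)
                q->counted_tail->b_next = mp;
            else
                q->counted_head = mp;
            q->counted_tail = mp;
            q->counted_bytes += size;
        } else {
            /* Once one message is over the limit, drop the rest of the chain. */
            msg_free(mp);
            freed++;
        }
        mp = next;
    }
    return freed;
}
```

The enqueue path only touches the pending tail pointer, so its cost does
not depend on the queue length. Each message is sized exactly once, when
the reaper moves it from the pending list to the counted list.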
