[
https://issues.apache.org/jira/browse/DISPATCH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16866295#comment-16866295
]
Francesco Nigro commented on DISPATCH-1371:
-------------------------------------------
{quote}When trying to figure out a use-after-free bug it is harder when the
most recently used object is put back in service immediately this way.
{quote}
By "free" do you mean qd_dealloc or a *real* free? Please help me understand
with an example :D
Anyway, I believe such errors should be exceptional, so I would rather optimize
the pool for the common case, i.e. normal/correct usage, unless that would make
it less robust.
{quote}If this change is known to be a faster way to go overall then I won't
object to it.
{quote}
Good point, that's yet to be proven: I've raised the PR because the change is
trivial and because it is part of a bigger effort to make qd_alloc/qd_dealloc
smarter while avoiding a full redesign, if possible.
I will measure the effects of this change in the next few days (hopefully) and
come back here to share my findings: maybe it won't make any difference,
because the bigger cost is the data-dependent loads caused by the linked-list
pointer chasing, i.e. we would save just 1 cache miss, if we are lucky.
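To make the pointer-chasing point concrete, here is a minimal generic sketch
(not the actual qd_alloc code; free_item_t, free_stack_t and pop are made-up
names for the example):
{code}
#include <stddef.h>

/* Hypothetical intrusive free-list node, for illustration only. */
typedef struct free_item_t {
    struct free_item_t *next;
} free_item_t;

typedef struct {
    free_item_t *head;   /* with LIFO reuse the most recently freed item sits here */
} free_stack_t;

static void *pop(free_stack_t *fl)
{
    free_item_t *item = fl->head;
    if (!item)
        return NULL;
    /* Data-dependent load: the head for the *next* pop is known only after
     * item->next has been fetched, so the prefetcher cannot run ahead.
     * LIFO reuse helps only because 'item' itself is likely still in cache,
     * i.e. at best it saves that single miss. */
    fl->head = item->next;
    return item;
}
{code}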
Do you know whether other similar projects have used a different approach to
this?
{quote}With the DEQ of objects doesn't adding an object to end of a queue also
change the pointer at the head of the queue?
{quote}
The only case where a change at the tail also modifies the head (both with this
change and in the original version) is when the free_list is empty or holds
just 1 element (the former when we insert something, the latter when we remove
the only element). The possible benefit (still to be proven) shows up when
there are already N elements in the free_list and M < N of them are used and
released back to the pool right after, e.g.:
e_0 = qd_alloc(...), e_1 = qd_alloc(...), e_2 = qd_alloc(...), ..., e_M = qd_alloc(...)
// we use the e_*; in the worst case each one sits on its own cache line
qd_dealloc(e_0), qd_dealloc(e_1), qd_dealloc(e_2), ..., qd_dealloc(e_M)
// the e_* are still each on their own cache line
e_0 = qd_alloc(...), e_1 = qd_alloc(...), e_2 = qd_alloc(...), ..., e_M = qd_alloc(...)
// each e_* won't take any cache miss because they are still hot in the cache
With FIFO ordering the risk is that we hand hot nodes to the pool on
qd_dealloc, but get back colder ones on qd_alloc.
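To show the two orderings side by side, a generic hedged sketch (item_t,
free_list_t, push_tail, pop_head and pop_tail are invented for the example;
dealloc appends at the tail in both cases, as described in the issue):
{code}
#include <stddef.h>

/* Illustration only: a made-up doubly-linked free list, not the real qd_alloc code. */
typedef struct item_t { struct item_t *prev, *next; } item_t;
typedef struct { item_t *head, *tail; } free_list_t;

/* dealloc side: freed items are appended at the tail. */
static void push_tail(free_list_t *fl, item_t *it)
{
    it->next = NULL;
    it->prev = fl->tail;
    if (fl->tail) fl->tail->next = it; else fl->head = it;
    fl->tail = it;
}

/* FIFO alloc: take the oldest item from the head; it is the most likely to be cold. */
static item_t *pop_head(free_list_t *fl)
{
    item_t *it = fl->head;
    if (!it) return NULL;
    fl->head = it->next;
    if (fl->head) fl->head->prev = NULL; else fl->tail = NULL;
    return it;
}

/* LIFO alloc: take the most recently freed item from the tail; it is the most
 * likely to still be hot in the cache of the thread that just released it. */
static item_t *pop_tail(free_list_t *fl)
{
    item_t *it = fl->tail;
    if (!it) return NULL;
    fl->tail = it->prev;
    if (fl->tail) fl->tail->next = NULL; else fl->head = NULL;
    return it;
}
{code}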
The *real* waste IMHO is:
# in either case, after reading one node the CPU cannot speculate which one
will be next (there is no sequential pattern to exploit)
# the nodes are not packed together on the same cache line or on adjacent
cache lines
My change cannot improve either of these issues (see the sketch below), so
maybe we won't see any evident benefit after applying it.
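Purely to illustrate those two points (not something this PR changes; the names
and sizes below are made up for the example):
{code}
#include <stddef.h>

enum { N = 1024, OBJ_SIZE = 128 };

/* Sequential layout: objects packed back to back in one slab.  The access
 * pattern is a fixed stride, so the hardware prefetcher can run ahead. */
static size_t sum_slab(const unsigned char *slab)
{
    size_t sum = 0;
    for (size_t i = 0; i < N; i++)
        sum += slab[i * OBJ_SIZE];
    return sum;
}

/* Free-list layout: the address of the next object is stored inside the
 * current one, so every step is a data-dependent load with no sequential
 * pattern to exploit (point 1), and the nodes may sit on arbitrary,
 * non-adjacent cache lines (point 2). */
typedef struct node_t {
    struct node_t *next;
    unsigned char  payload[OBJ_SIZE - sizeof(struct node_t *)];
} node_t;

static size_t sum_list(const node_t *head)
{
    size_t sum = 0;
    for (const node_t *n = head; n; n = n->next)
        sum += n->payload[0];
    return sum;
}
{code}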
> qd_alloc can reuse instances in LIFO order
> ------------------------------------------
>
> Key: DISPATCH-1371
> URL: https://issues.apache.org/jira/browse/DISPATCH-1371
> Project: Qpid Dispatch
> Issue Type: Improvement
> Affects Versions: 1.8.0
> Reporter: Francesco Nigro
> Priority: Minor
>
> qd_dealloc inserts instances at the tail of the thread-local free_list
> while qd_alloc reads from the head, always causing cache misses unless the
> free_list size is 1.
> qd_alloc could instead reuse the last inserted instance, i.e. the tail,
> using the free_list as a stack (with LIFO accesses).