[ 
https://issues.apache.org/jira/browse/DISPATCH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16866295#comment-16866295
 ] 

Francesco Nigro commented on DISPATCH-1371:
-------------------------------------------

{quote}When trying to figure out a use-after-free bug it is harder when the 
most recently used object is put back in service immediately this way.
{quote}
By "free" do you mean qd_dealloc or a *real* free? Please help me understand 
with an example :D
 Anyway, I believe that such errors should be exceptional, and I would optimize 
the pool for the common case instead, i.e. normal/correct usage, unless that 
would make it less robust.
{quote}If this change is known to be a faster way to go overall then I won't 
object to it.
{quote}
Good point, that's yet to be proven: I raised the PR because the change is 
trivial and because it is part of a bigger effort to make qd_alloc/qd_dealloc 
smarter while avoiding a redesign, if possible.
 I will try the effects of this change in the next few days (hopefully) and 
come back here to share my findings: maybe it won't make any difference, 
because the bigger cost is the data-dependent loads caused by the linked-list 
pointer chasing, i.e. we would save just 1 cache miss, if we are lucky.

Are you aware of any other similar projects where a different approach to this 
has been used?
{quote}With the DEQ of objects doesn't adding an object to end of a queue also 
change the pointer at the head of the queue?
{quote}
The only cases where a change at the tail modifies the head (in this change, 
but the same was true for the original version) are when the free_list is 
empty or holds just 1 element (the former when we insert something, the latter 
while removing the only element): the possible benefit (to be proven) appears 
when there are already N elements in the free_list and M < N of them are used 
and released back to the pool right after, e.g.

 

e_0 = qd_alloc(...), e_1 = qd_alloc(...), e_2 = qd_alloc(...) ... e_M = qd_alloc(...)

// we use each e_*; in the worst case each one sits on its own cache line

qd_dealloc(e_0), qd_dealloc(e_1), qd_dealloc(e_2) ... qd_dealloc(e_M)

// each e_* is still on its own cache line

e_0 = qd_alloc(...), e_1 = qd_alloc(...), e_2 = qd_alloc(...) ... e_M = qd_alloc(...)

// each e_* gets no cache miss because it is already hot in the cache

 

If we use FIFO ordering, the risk is that we offer hot nodes to the pool on 
qd_dealloc but get colder ones back on qd_alloc.

The *real* waste IMHO is:
 # after reading one node, the CPU cannot speculate (in either case) which 
node will be next (there is no sequential pattern to exploit)
 # the nodes are not packed together on the same cache line or on adjacent 
cache lines

My change cannot improve either of these issues, so maybe we won't see any 
evident benefit after applying it.

> qd_alloc can reuse instances in LIFO order
> ------------------------------------------
>
>                 Key: DISPATCH-1371
>                 URL: https://issues.apache.org/jira/browse/DISPATCH-1371
>             Project: Qpid Dispatch
>          Issue Type: Improvement
>    Affects Versions: 1.8.0
>            Reporter: Francesco Nigro
>            Priority: Minor
>
> qd_dealloc is inserting instances at the tail of the thread-local free_list 
> while qd_alloc is reading from the head, always causing cache misses unless 
> the free_list size is 1.
> qd_alloc could instead reuse the last inserted instance, i.e. the tail, using 
> free_list as a stack (with LIFO accesses).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
