Question about CPU caches and D context pointers

Etienne Mon, 17 Feb 2014 19:22:22 -0800

I've had this question at the back of my mind, and I know it's probably related to back-end optimizations, but I'm taking a chance to see if anyone knows anything.

I know how insignificant the speed difference may be, but keep in mind this is to further my low-level understanding. Here's an example to illustrate the question, because it's quite complicated (to me):


#1 contextual function
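struct Contents {
        ubyte[] m_buffer;

        // D structs cannot declare a no-argument constructor,
        // so the capacity is passed in explicitly.
        this(size_t capacity){
                m_buffer.reserve(capacity);
        }

        void rcv(string str){
                m_buffer ~= cast(const(ubyte)[]) str;
        }

        void flush(){
                send_4092_bytes_of_data_to_final_heap_buffer(m_buffer);
                m_buffer.length = 0;
                m_buffer.assumeSafeAppend(); // reuse the same allocation on the next append
        }
}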

vs..
#2 context-less function

void rcv(string str){
        send_small_bytes_of_data_to_final_heap_buffer(str);
}

The first case is the struct. When entering the rcv() function, I know the pointer and length of m_buffer are on the stack at that point. That's pretty damn fast to access b/c the CPU caches keep these at level 1 through the whole routine. However, it's not obvious to me whether the memory that m_buffer points to will stay in the CPU cache if there are 5 or so consecutive calls to this same routine in the same thread. Also note that it will flush to another buffer, so there are more heap round trips between buffers if the CPU cache isn't efficient.

The second case (context-less) just sends the string right through to the final allocation procedure (another buffer), and the string stays a function parameter so it's on the stack, thus in the CPU cache through every call frame until the malloc takes place (1 heap round trip regardless of any optimization).

So, would there be any chance for m_buffer's pointee region to stay in the CPU cache if there are thousands of consecutive calls to the struct's rcv, or do I have to keep the data on the stack and send it straight to the allocator? Is there an easy way to visualize how the CPU cache empties or fills itself, or to guarantee heap data stays in there without using the stack?
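To make the comparison concrete, here is a rough sketch of how I imagine the two variants could be timed side by side. Everything in it is an assumption made for illustration: sendToFinalBuffer and finalBuffer are hypothetical stand-ins for the "final heap buffer" above, the 4092-byte threshold and the call count are arbitrary, and it relies on std.datetime.stopwatch from a current Phobos. It only measures wall-clock time, so it cannot say whether any difference comes from cache misses; as far as I understand, a tool such as perf or Valgrind's cachegrind would be needed for that.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Hypothetical stand-in for the "final heap buffer".
ubyte[] finalBuffer;

void sendToFinalBuffer(const(ubyte)[] data)
{
    finalBuffer ~= data;
}

// Variant #1: accumulate in a struct-owned staging buffer, then flush.
struct Contents
{
    ubyte[] m_buffer;

    void rcv(string str)
    {
        m_buffer ~= cast(const(ubyte)[]) str;
        if (m_buffer.length >= 4092)
            flush();
    }

    void flush()
    {
        sendToFinalBuffer(m_buffer);
        m_buffer.length = 0;
        m_buffer.assumeSafeAppend(); // keep reusing the same allocation
    }
}

// Variant #2: no context, every string is forwarded immediately.
void rcvDirect(string str)
{
    sendToFinalBuffer(cast(const(ubyte)[]) str);
}

void main()
{
    enum calls = 100_000;
    string msg = "0123456789abcdef";

    // Time the buffered variant, including the final flush.
    Contents c;
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. calls)
        c.rcv(msg);
    c.flush();
    auto buffered = sw.peek;
    auto bufferedBytes = finalBuffer.length;

    // Time the direct variant against a fresh final buffer.
    finalBuffer = null;
    sw.reset();
    foreach (i; 0 .. calls)
        rcvDirect(msg);
    auto direct = sw.peek;

    // Both variants must deliver the same number of bytes.
    assert(bufferedBytes == finalBuffer.length);
    writefln("buffered: %s, direct: %s", buffered, direct);
}
```

The staging buffer here is reused after every flush, so the same few kilobytes are touched over and over, which is the situation I'm asking about.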

I'm sorry if the question seems complicated. I read everything Ulrich Drepper had to say in "What Every Programmer Should Know About Memory", and I still have a bit of a hard time visualizing the question myself.
