Re: performance of drlvm

zouqiong Wed, 23 Aug 2006 18:51:39 -0700

Hi, Mikhail,

1. Regarding item 2 of the list: Chilimbi actually uses a GC that moves
objects to improve cache locality [Profile-guided Proactive Garbage
Collection for Locality Optimization], but without using his algorithm.
I will read the paper again. His algorithm mainly focuses on applications
written in C or C++. Maybe we can make use of the GC to improve the effect
of the algorithm.


2. I read Chilimbi's paper again and worked out a framework, which I will
describe here.
Firstly, I want to follow Chilimbi's Bursty Tracing framework [Bursty
Tracing: A Framework for Low-Overhead Temporal Profiling]. It keeps two
versions of the same code, one containing only the checks and another
containing the instrumentation. The instrumented code records all the
memory access patterns, called the data reference trace. What I want to do
is modify the instrumented code: once nCheck reaches 0, we activate the
performance counters to profile the cache miss rate and cache miss
addresses of the instrumented code until nInstr reaches 0. We set a
threshold, and the instructions whose miss rate is larger than it are
delinquent loads or delinquent stores. As we only activate the performance
counters for a short period, it won't bring much overhead.
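
A rough sketch of the switching logic described above, written as a small
Python simulation rather than real VM code. The class name, the record()
hook and the counters_on flag are all hypothetical; counters_on only stands
in for enabling the hardware performance counters, and the miss information
is passed in by the caller instead of being read from real counters.

```python
from collections import defaultdict


class BurstyProfiler:
    """Toy model: nCheck/nInstr switching plus sampled miss profiling."""

    def __init__(self, n_check0, n_instr0):
        self.n_check0 = n_check0      # checks per checking phase
        self.n_instr0 = n_instr0      # checks per instrumented burst
        self.n_check = n_check0
        self.n_instr = 0
        self.counters_on = False      # hypothetical stand-in for the counters
        self.accesses = defaultdict(int)
        self.misses = defaultdict(int)

    def check(self):
        """Run at every check point; True means the next code segment
        runs in the instrumented version with the counters active."""
        if self.counters_on:
            self.n_instr -= 1
            if self.n_instr == 0:     # burst finished, back to checking code
                self.counters_on = False
                self.n_check = self.n_check0
        else:
            self.n_check -= 1
            if self.n_check == 0:     # start a burst, activate the counters
                self.counters_on = True
                self.n_instr = self.n_instr0
        return self.counters_on

    def record(self, pc, miss):
        """Record one memory access; ignored while the counters are off."""
        if not self.counters_on:
            return
        self.accesses[pc] += 1
        if miss:
            self.misses[pc] += 1

    def delinquent(self, threshold):
        """Instructions whose sampled miss rate is larger than threshold."""
        return sorted(pc for pc, n in self.accesses.items()
                      if self.misses[pc] / n > threshold)
```

With n_check0 = 3 and n_instr0 = 2, two out of every five code segments run
instrumented, so the sampled fraction is n_instr0 / (n_check0 + n_instr0).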

Secondly, we abstract the delinquent instructions, output the WPS, and then
use the SEQUITUR algorithm to process them. The complexity of the algorithm
is O(EL). In Chilimbi's paper [Efficient Representations and Abstractions
for Quantifying and Exploiting Data Reference Locality], he reduces the
trace by processing the WPS twice, and I think we can follow his way.
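
To illustrate how a reference trace can be turned into a hierarchical
grammar, here is a minimal sketch. Note that it is NOT SEQUITUR: SEQUITUR
builds its grammar incrementally while reading the input, whereas this
sketch uses a simpler offline scheme that repeatedly replaces the most
frequent adjacent pair of symbols with a new rule (in the style of Re-Pair).
The function names and the "R0", "R1", ... rule names are hypothetical, and
the sketch assumes no trace symbol is itself named like a rule.

```python
from collections import Counter


def infer_grammar(trace):
    """Replace the most frequent adjacent pair by a new rule until no
    pair occurs at least twice. Returns (start sequence, rules)."""
    seq = list(trace)
    rules = {}
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        name = f"R{next_id}"          # hypothetical rule naming
        next_id += 1
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):
            # Scan left to right and replace non-overlapping occurrences.
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(symbol, rules):
    """Expand a symbol back into the trace symbols it stands for."""
    if symbol not in rules:
        return [symbol]
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


trace = list("abcabcabc")             # stand-in for a data reference trace
seq, rules = infer_grammar(trace)
restored = [t for s in seq for t in expand(s, rules)]
```

Expanding every symbol of the start sequence reproduces the original trace,
so the grammar is a lossless and usually much shorter representation of it.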

The main differences from Chilimbi's algorithm are listed above. There must
be some deficiencies in this approach; let's discuss them.


--------------
Best Regards,
Qiong,Zou
