Nope, that didn't help.  I tried allocating 1024 bytes for the output buffer
(I only needed 105 bytes, as stated in my previous email) and I still have the
stale data problem unless I explicitly do a writeback-invalidate.  So that
eliminates the "integral number of cache lines" requirement as the culprit for
my cache problem.  I used Memory_contigAlloc, so that should take care of the
contiguous requirement, but how do you ensure the cache-line alignment
requirement?
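
A minimal sketch of the size and alignment arithmetic in question, assuming a 128-byte cache line (the L2 line size usually quoted for the C64x+ DSP; check your device's documentation). The helper names are made up for illustration and are not part of Codec Engine:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: 128 bytes is the cache line size that matters here.
 * Confirm the real value for your device before relying on it. */
#define CACHE_LINE_SIZE 128u

/* Hypothetical helper: round a requested size up to the next whole
 * number of cache lines. */
static size_t round_up_to_cache_line(size_t nbytes)
{
    /* Guard against wrap-around in the addition below. */
    assert(nbytes <= SIZE_MAX - (CACHE_LINE_SIZE - 1u));
    return (nbytes + CACHE_LINE_SIZE - 1u) / CACHE_LINE_SIZE * CACHE_LINE_SIZE;
}

/* Hypothetical helper: 1 if the address starts on a cache-line
 * boundary, 0 otherwise. */
static int is_cache_line_aligned(const void *addr)
{
    return ((uintptr_t)addr % CACHE_LINE_SIZE) == 0u;
}

/* Hypothetical helper: 1 if a buffer is both line-aligned and sized as
 * a whole number of lines, 0 otherwise. */
static int buffer_is_cache_safe(const void *addr, size_t nbytes)
{
    return is_cache_line_aligned(addr) && (nbytes % CACHE_LINE_SIZE) == 0u;
}
```

Under these assumptions a 105-byte request rounds up to 128 bytes, and 1024 bytes is already a whole number of lines. If I recall the Codec Engine OSAL correctly, Memory_contigAlloc takes an alignment argument alongside the size, which would be the place to pass the line size; treat that as something to verify against your own headers.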

My DSP codec is not writing to the IN buffers, and the OUT buffers are not
written to by my ARM application, so that eliminates Scott's points 2 and 3
as culprits.

Regards,
Andy

----- Original Message ----
From: Andy Ngo <[EMAIL PROTECTED]>
To: "Gary, Scott" <[EMAIL PROTECTED]>; Adam Dawidziuk <[EMAIL PROTECTED]>
Cc: "davinci-linux-open-source @linux.davincidsp.com" 
<[email protected]>
Sent: Thursday, March 1, 2007 10:02:59 PM
Subject: Re: Cache coherency issue?

Scott,

Thanks for your response.  On the ARM side, I used Memory_contigAlloc to
allocate the input and output buffers; doesn't Memory_contigAlloc
automatically handle the contiguous and cache-line alignment requirements?
You say the buffers must be sized as an integral number of cache lines; how do
I know what that "number of cache lines" is, or where do I find that out?  I
think the "integral number" may be my problem, since I only allocate as much
as I actually use for the output buffer; the size of my output buffer is 105
bytes, which I doubt is an "integral number".  Thanks for your advice; I'll
try it out soon and let you know.

Regards,
Andy

----- Original Message ----
From: "Gary, Scott" <[EMAIL PROTECTED]>
To: Andy Ngo <[EMAIL PROTECTED]>; Adam Dawidziuk <[EMAIL PROTECTED]>
Cc: "davinci-linux-open-source @linux.davincidsp.com" 
<[email protected]>
Sent: Thursday, March 1, 2007 8:15:51 PM
Subject: RE: Cache coherency issue?



 




Andy,

The framework should indeed be handling the necessary buffer invalidates
before, and writebacks after, calling the algorithm's process function.

To verify this, you could try turning on trace (as in the archived message);
you should see similar statements indicating the operations on the specific
buffers.  The mask name for these memory calls is "OM".  If you are using
TraceUtil, you can specify CE_TRACE="OM=012".

I don't know if they apply, but some general things to be careful of:

- Make sure your buffers are contiguous, cache-line aligned, and sized as
  an integral number of cache lines.
- DSP should not write to IN buffers.
- OUT buffers should not be used to pass data to DSP.

Hope this helps.

Regards,
Scott




  
  
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andy Ngo
Sent: Thursday, March 01, 2007 1:54 PM
To: Andy Ngo; Adam Dawidziuk
Cc: davinci-linux-open-source @linux.davincidsp.com
Subject: Re: Cache coherency issue?



  

  
OK, I got desperate and needed to get my codec working correctly right away.
So I tried forcing a write-back invalidate on the output buffers and now
everything seems to work fine; before SPHENC_process returns, I call the
following to write-back invalidate the output buffers:

        Memory_cacheWbInv(outBufs->bufs[0], size);

I don't know why the XDAIS framework layer is not doing that automatically
for me.  Any thoughts, anyone?

Thanks to X. Zhou
(http://www.mail-archive.com/[email protected]/msg00960.html)
for bringing this up (and still, no one has really given a clear answer).

Regards,
Andy


----- Original Message ----
From: Andy Ngo <[EMAIL PROTECTED]>
To: Adam Dawidziuk <[EMAIL PROTECTED]>
Cc: "davinci-linux-open-source @linux.davincidsp.com"
<[email protected]>
Sent: Wednesday, February 28, 2007 10:46:21 PM
Subject: Re: Cache coherency issue?


  
Adam,

Thanks for the quick response.  I'm not quite sure what you are saying.  I
basically took the example sphenc_copy codec as a template and customized it
to create my own speech encoder codec.  All the configuration (the contents of
the .tcf and .cfg files) pretty much stayed the same.  My codec algorithm does
not explicitly call any DMA functions, so I'm not sure how I'm "accessing the
output buffers both by the CPU and DMA".  I added a lot of code and data to
the codec, so the only change I can see is the increase in memory usage of the
DDR region (4MB code, stack, static data).  How can I debug this issue to see
if the DSP and DMA are accessing the output buffers as you suggested?  Like
you said, this seems like a caching problem.  I guess I'll try to handle the
cache coherency myself, which the Framework is supposed to do for me
automatically.

Anyone, any thoughts?

Regards,
Andy



----- Original Message ----
From: Adam Dawidziuk <[EMAIL PROTECTED]>
To: Andy Ngo <[EMAIL PROTECTED]>
Cc: "davinci-linux-open-source @linux.davincidsp.com"
<[email protected]>
Sent: Wednesday, February 28, 2007 9:09:49 PM
Subject: Re: Cache coherency issue?


Andy,

Don't get me wrong, but personally I think you don't exactly follow the DMA
rules in your algorithm. It seems that for some reason you access the output
buffers both by CPU and by DMA. Thus some of the data is left in cache, and
some is in external memory. GT_trace probably accesses all your data via the
CPU, thus triggering an automatic write-back when the cache runs out.
Are you 100% sure your data is coherent upon returning from the process
call, presumably all in external memory?

Hope you figure it out and share the solution with the community. I would
certainly want to see what's going on, since I had strange cache coherency
problems myself...

Best,

Andy Ngo wrote:
> According to a previous post
> (http://www.mail-archive.com/[email protected]/msg00960.html),
> the XDAIS Framework is supposed to handle cache coherency automatically
> (points 1-4 in the post above).  Recently, I have been adding more and
> more code and data to my DSP speech codec and I've been getting weird
> problems with the data exchanged between the ARM and the DSP.  For
> example, I would always get the same exact data in the output buffer
> from a call to SPHENC_process.  In an attempt to debug the problem, I put
> a GT_trace call in my DSP speech codec to print out the data that was
> being returned from SPHENC_process so that I could compare it to the data
> I saw being returned on the ARM side.  The weird thing is that by putting
> GT_trace in the speech codec, the problem went away (the returned data is
> different each time).  As soon as I comment out the call to GT_trace, the
> problem comes back (the ARM side sees the same data being returned).
>
> Am I doing something wrong?  Is there a cache coherency problem here?  Why
> does adding a simple GT_trace fix the problem and make my data look
> correct?  Instead of GT_trace, I tried putting in some hard delays in an
> attempt to affect timing, but that didn't work; only a call to GT_trace
> works.  I've been on this for several days now.
>
> Please advise.  Thanks in advance.
>
> Regards,
> Andy
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Davinci-linux-open-source mailing list
> [email protected]
> http://linux.davincidsp.com/mailman/listinfo/davinci-linux-open-source


-- 
  
Adam Dawidziuk
Sentivision





