BTW, I posted a response to Zhou's cache question.  I think he got a
copy, but no one else did because of the notification below.  I've
attached my response, modulo the pdf... we'll see if the moderator
ultimately approves.

List members, is this 100KB limit "too low"?

Chris 

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] 
> Sent: Saturday, November 25, 2006 9:41 AM
> To: Ring, Chris
> Subject: Your message to Davinci-linux-open-source awaits 
> moderator approval
> 
> Your mail to 'Davinci-linux-open-source' with the subject
> 
>     RE: cache coherency problem
> 
> Is being held until the list moderator can review it for approval.
> 
> The reason it is being held:
> 
>     Message body is too big: 303472 bytes with a limit of 100 KB
> 
> Either the message will get posted to the list, or you will receive
> notification of the moderator's decision.  If you would like to cancel
> this posting, please visit the following URL:
 
<chop>
--- Begin Message ---
I think you understand the cache issue correctly.  Namely:
   *  The data buffers must not be cached on the ARM side.
      * Buffers acquired through CMEM (or preferably CE's
Memory_contigAlloc()) _are_ non-cached on the ARM side; they're mapped
into the calling process's space via ioremap_nocache().
 
   *  The data buffers _are_ typically cached on the DSP side, and
responsibility for cache coherency is placed on the Codec Engine
skeletons.  The skeletons are the "remote" side of the RPC mechanism,
and are explained in the Algorithm Creator's Guide, as Jerry Johns
previously mentioned.
      * For the 8 VISA interfaces for which Codec Engine provides
stubs/skeletons, the skeletons perform this cache maintenance.  Namely,
they invalidate the input buffers before the process() call, and
writeback-invalidate the output buffers after the process() call.
 
And a final, strange constraint... the buffers _should_ be aligned on a
cache-line boundary.  On DaVinci, the DSP's cache line is 128 bytes.
Cache maintenance is performed on whole cache lines; if a buffer isn't
aligned on this boundary, any other buffer sharing a cache line with a
working data buffer could be corrupted.
 
[ The VISA stubs/skeletons _used_ to check for this 128-byte alignment
and fail if the constraint wasn't met, but "smart" applications that
know about all this cache maintenance can be written correctly (I won't
go into that here), so this check was removed. ]
 
Another thing to look out for is whether the _algorithm_ _writes_ into
the "input" buffers.  This is against the xDM spec, and can result in
strange corruption of subsequent input buffers.  Because the input
buffers are not cache-maintained _after_ the process() call (per the
xDM spec, they don't need to be), any writes to them will be sitting in
the cache lines, and may be written back "some time later", whenever the
DSP cache hardware decides.  Therefore, a mis-behaving codec that
writes into input buffers can corrupt the ARM-side view of the data in
strange, "late" ways.
 
Which codec are you using?  Is it a TI codec?
 
I've attached a snapshot of a TI internal TWiki topic that might help as
well.  (I hope it goes through - there are rumors of the list rejecting
files larger than 100kB... we'll see...)
 
Chris


________________________________

        From: X. Zhou [mailto:[EMAIL PROTECTED] 
        Sent: Thursday, November 23, 2006 9:27 PM
        To: Ring, Chris; X. Zhou
        Cc: [email protected]
        Subject: cache coherency problem
        
        
        Hi, 
        This nasty problem is driving me mad!!
         
        I am using a DVEVM (DM6446) board and the DVSDK environment to
develop an ARM client + DSP video decoder application.
         
        I have found that the bitstream buffer transferred from the ARM
to the DSP sometimes contains stale data.
         
        Details of the case are given below:
         
        (1) the bitstream buffer was pre-allocated on the ARM side, via
the Memory_contigAlloc() function;
            
            [ In my opinion, Memory_contigAlloc() should provide buffers
              that are not only aligned, but also non-cached and
              physically contiguous.
            
              Isn't that right?? 
            ]   
         
        (2) each time I call the VIDDEC_process() interface on the ARM
            side, I pass the bitstream buffer pointer via the
            "XDM_BufDesc inBufs" parameter, e.g.:
            
            streamBuf = Memory_contigAlloc(800000, -1);
            
            ................
            
            actualStreamSize = fread(streamBuf, 1, 800000, fp);
            
            ................
            
            inBufs.numBufs = 1;
            inBufAddr[0] = streamBuf;
            inBufSize[0] = actualStreamSize;
            inBufs.bufs[0] = inBufAddr;
            inBufs.bufSizes[0] = inBufSize;
            
            printf("ARM : 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
                           *((unsigned char *)inBufAddr[0]+0),
                           *((unsigned char *)inBufAddr[0]+1),
                           *((unsigned char *)inBufAddr[0]+2),
                           *((unsigned char *)inBufAddr[0]+3),
                           *((unsigned char *)inBufAddr[0]+4),
                           *((unsigned char *)inBufAddr[0]+5));
            
             /*  Call the process function to decode the nalu buffers */
            status = VIDDEC_process(hdecode,
                                    &inBufs,
                                    &outBufs,
                                    (IVIDDEC_InArgs *)(&decoder_inargs),
                                    (IVIDDEC_OutArgs *)(&decoder_outargs));
                                    
             ................
                                         
        (3) each time the DSP-side IVIDDEC_process() function is called,
            I invalidate the cache lines covering the bitstream buffer,
            e.g.:
            
            XDAS_Int32 IVIDDEC_process(IVIDDEC_Handle h,
                                       XDM_BufDesc *inBufs,
                                       XDM_BufDesc *outBufs,
                                       IVIDDEC_InArgs *inArgs,
                                       IVIDDEC_OutArgs *outArgs)
            {
                int i;
                for ( i = 0; i < inBufs->numBufs; i ++ )
                {
                    GT_6trace(curDecTrace, GT_ENTER,
                           "DSP : before invalid: 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
                           *((unsigned char *)inBufs->bufs[i]+0),
                           *((unsigned char *)inBufs->bufs[i]+1),
                           *((unsigned char *)inBufs->bufs[i]+2),
                           *((unsigned char *)inBufs->bufs[i]+3),
                           *((unsigned char *)inBufs->bufs[i]+4),
                           *((unsigned char *)inBufs->bufs[i]+5));
         
                    BCACHE_inv(inBufs->bufs[i], inBufs->bufSizes[i],
                               TRUE);    // invalidate cache 
                
                    GT_6trace(curDecTrace, GT_ENTER,
                           "DSP : after invalid: 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
                           *((unsigned char *)inBufs->bufs[i]+0),
                           *((unsigned char *)inBufs->bufs[i]+1),
                           *((unsigned char *)inBufs->bufs[i]+2),
                           *((unsigned char *)inBufs->bufs[i]+3),
                           *((unsigned char *)inBufs->bufs[i]+4),
                           *((unsigned char *)inBufs->bufs[i]+5));
                 }
                 
                 .............
                 
                 decode_one_frame(inBufs, .....);
                 
                 ............
            }     
         
        

          However, the experimental results show that the data read by
          the DSP is sometimes inconsistent with the data written by the
          ARM, and when that happens, the DSP appears to read the data
          the ARM wrote the *previous* time.
          
          How can I fix this problem?
          
          Is the cache on the ARM side not being flushed (if
          Memory_contigAlloc() provides cacheable buffers)?  Or is the
          cache on the DSP side not being invalidated successfully (if
          BCACHE_inv() is a no-op)?
          
          I'm going mad! Help me, please?!!!

--- End Message ---
_______________________________________________
Davinci-linux-open-source mailing list
[email protected]
http://linux.davincidsp.com/mailman/listinfo/davinci-linux-open-source
