Hi all,
OK. I believe I understand now, so I won't go on beating a dead horse. By the
way, Juan from TI just sent me the following to clarify
"ARM and DSP simultaneous memory access". Hope it helps:
------------------------------------------------------------------------------------------------------------------------------------
Hi Andy,
From a hardware point of view, there is a single DDR2 bus and controller, and
hence ARM and DSP cannot access DDR2 in the same DDR2 cycle simultaneously.
However, ARM and DSP are likely to read DDR2 in 32-byte bursts into their
respective caches and do some processing from their caches, such that each is
not accessing DDR2 in every possible DDR2 cycle. Hence, for most applications,
the DDR2 bandwidth is large enough to accommodate multiple clients (such as ARM
and DSP) without causing major bottlenecks. Let me know if this clears things
up.
-------------------------------------------------------------------------------------------------------------------------------------
Before I sign off, I want to ask a couple of questions about the DSP cache.
Let's use the video_copy codec CE server as an example.
Do I need to do anything to get the video_copy CE server to use the available
L1 and L2 cache, or is that done automatically by DSP/BIOS? I know you can
allocate the amount of cache (or general-purpose internal memory) to be used in
the video_copy.cfg file, but do I need to do anything else to get the DSP
application to use it?
Thanks.
Regards,
Andy
----- Original Message ----
From: Steve Spano <[EMAIL PROTECTED]>
To: "Griffis, Brad" <[EMAIL PROTECTED]>; Andy Ngo <[EMAIL PROTECTED]>; "Azbell,
Brandon" <[EMAIL PROTECTED]>; [email protected]
Sent: Thursday, January 18, 2007 12:48:58 PM
Subject: RE: Can the ARM and DSP accessing the RAM simulatenously?
Hi Folks,
Maybe to help clarify this a bit, from my point of view:
1) You can only have one bit of data from one point in memory on the
traces for the DDR at any time
2) As we know, DDR2 supports many kinds of burst width accesses
3) As Juan states, the software queues read/write requests
4) So, the DaVinci should be capable, at the hardware level, of
interleaving the burst accesses between the ARM and the DSP side (like every 8
or 16 cycles, for example)
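
The interleaving idea in point 4 can be pictured with a toy model. This is only a sketch under assumptions: the round-robin policy, the function name `interleave_bursts`, and the 8-cycle burst length are illustrative choices, not the DM6446 arbiter's documented behavior (the real controller applies a configurable priority scheme).

```python
from collections import deque


def interleave_bursts(requests, burst_cycles=8):
    """Toy round-robin arbiter: grant one burst at a time, alternating clients.

    requests: dict mapping client name -> number of pending bursts.
    Returns a list of (start_cycle, client) grants in bus order.
    """
    pending = deque((name, count) for name, count in requests.items() if count > 0)
    schedule = []
    cycle = 0
    while pending:
        name, count = pending.popleft()
        schedule.append((cycle, name))   # this client owns the bus for one burst
        cycle += burst_cycles
        if count > 1:
            pending.append((name, count - 1))  # rejoin the queue behind the others
    return schedule


# Hypothetical workload: 3 pending ARM bursts and 2 pending DSP bursts.
schedule = interleave_bursts({"ARM": 3, "DSP": 2}, burst_cycles=8)
for start, client in schedule:
    print(f"cycle {start:3d}: {client} burst")
```

Only one client is on the bus in any given slot, yet both make steady progress, which is the sense in which the accesses appear "simultaneous" to software.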
Steve Spano
FLE
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Griffis, Brad
Sent: Thursday, January 18, 2007 3:13 PM
To: Andy Ngo; Azbell, Brandon; [email protected]
Subject: RE: Can the ARM and DSP accessing the RAM simulatenously?
Juan is discussing this from a Linux threads perspective, not the actual
interface. As Brandon mentioned, there is only one bus and it is impossible
for two sets of data to be on the pins simultaneously.
From: Andy Ngo [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 18, 2007 1:50 PM
To: Griffis, Brad; Azbell, Brandon;
[email protected]
Subject: Re: Can the ARM and DSP accessing the RAM simulatenously?
Hmm..., I just received an answer from a TI expert that says that the ARM and
DSP can still access the DDR simultaneously as long as each is accessing the
DDR memory partition dedicated to it (in other words, as long as they are not
accessing CMEM at the same time). Here's the response I received from TI.
From what he said, the ARM and DSP don't block each other if they access their
own dedicated DDR space, even from the same physical chip. Am I missing
something?
Regards,
Andy Ngo
----------------------------------------------------------------------------------------------------------------------------------------------------------
Hi Andy,
To answer your question, let me point out that the DDR2 controller has a
theoretical maximum data throughput of 1296 MByte/sec; tests we have conducted
actually show ~95% utilization (1240 MByte/sec). That said, ARM, DSP, EDMA,
VPSS, and master peripherals can all access DDR2 (not all may be used in your
particular design).
There is a default prioritization scheme as to which client gets it, but you
can alter this via DM6446 registers.
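
The throughput figures above can be turned into a rough bandwidth budget. The following is a minimal sketch, not a TI tool: the 162 MHz clock and 32-bit bus width are assumptions chosen because they reproduce the quoted 1296 MByte/sec, and the per-client demand numbers are purely hypothetical placeholders.

```python
PEAK_MB_S = 1296.0      # theoretical maximum quoted above
MEASURED_MB_S = 1240.0  # measured throughput quoted above


def ddr2_peak_mb_s(clock_mhz, bus_bytes):
    """Peak rate of a double-data-rate bus: two transfers per clock cycle."""
    return clock_mhz * 2 * bus_bytes


def budget(demands_mb_s, available_mb_s=MEASURED_MB_S):
    """Return (total demand, remaining headroom) in MByte/sec."""
    total = sum(demands_mb_s.values())
    return total, available_mb_s - total


# Assumed bus parameters (not stated in the thread): 162 MHz clock, 4-byte bus.
peak = ddr2_peak_mb_s(162, 4)

# Hypothetical per-client demand in MByte/sec; replace with your own estimates.
demands = {"ARM": 100.0, "DSP": 250.0, "EDMA": 150.0, "VPSS": 120.0}
total, headroom = budget(demands)

print(f"computed peak: {peak} MB/s")
print(f"measured / peak: {MEASURED_MB_S / PEAK_MB_S:.1%}")
print(f"total demand: {total:.0f} MB/s, headroom: {headroom:.0f} MB/s")
```

A positive headroom only says the average load fits; it does not capture latency or short-term contention between clients.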
As your hardware engineer pointed out, all accesses to DDR2 will go through one
controller and a common bus. However, I do not believe this will cause a
bottleneck issue for you. To answer your specific question, Linux (on ARM) is a
multithreaded system and BIOS (on DSP) is a single-threaded system. We already
know that calls to DSP are queued, must be done from a common Linux thread, and
are blocking (meaning the specific Linux thread is blocked until DSP finishes
processing the call).
However, you can have other Linux threads free to access DDR2 in the meantime.
In part, the memory map you described serves to separate ARM DDR2 space from
DSP DDR2 space and also defines CMEM (contiguous DDR2 memory shared by both ARM
and DSP). This means that any DSP algorithm can access DSP DDR2 space without
ARM caring about it, and similarly ARM can access ARM DDR2 space without DSP
caring about it. The only time they block each other is when ARM calls on DSP
to process a buffer from CMEM space, and only the Linux thread calling DSP is
blocked. In all other cases, so long as there is DDR2 bandwidth left (and
chances are there will be), both ARM and DSP can access DDR2 without concern.
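
The blocking behavior described above can be illustrated with ordinary threads. This is a sketch of the idea only: it does not use the Codec Engine API, and `dsp_requests`, `dsp_worker`, `call_dsp`, and `dsp_release` are hypothetical names standing in for the ARM-to-DSP call path.

```python
import queue
import threading

dsp_requests = queue.Queue()     # hypothetical stand-in for the queued DSP calls
dsp_release = threading.Event()  # lets the demo decide when the "DSP" finishes


def dsp_worker():
    """Simulated DSP: handles one queued call at a time, in order."""
    while True:
        item = dsp_requests.get()
        if item is None:          # shutdown marker
            break
        buf, done, result = item
        dsp_release.wait()        # simulate a long-running algorithm
        result.append(sum(buf))   # placeholder for the real processing
        done.set()


def call_dsp(buf):
    """Blocking call: only the calling thread waits for the result."""
    done, result = threading.Event(), []
    dsp_requests.put((buf, done, result))
    done.wait()
    return result[0]


results = {}
counter = [0]


def caller_thread():
    results["dsp"] = call_dsp([1, 2, 3])


def other_thread():
    for _ in range(1000):         # unrelated work that never touches the DSP
        counter[0] += 1


worker = threading.Thread(target=dsp_worker)
caller = threading.Thread(target=caller_thread)
other = threading.Thread(target=other_thread)
worker.start()
caller.start()
other.start()
other.join()                      # completes while the DSP call is still pending
caller_was_blocked = caller.is_alive()
dsp_release.set()                 # now let the simulated DSP finish
caller.join()
dsp_requests.put(None)
worker.join()
print(caller_was_blocked, counter[0], results["dsp"])
```

The second thread runs to completion while the first is still waiting on the simulated DSP, which mirrors the point that only the calling Linux thread is blocked.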
That said, how much DDR2 bandwidth do you estimate you will need to access at
peak loading?
Let me know if this helps clear things up and if there is anything else we can
assist you with.
Best Regards,
Juan Gonzales
DSP Applications
Texas Instruments
Semiconductor Technical Support
http://www-k.ext.ti.com/sc/technical_support/pic/americas.htm
----- Original Message ----
From: "Griffis, Brad" <[EMAIL PROTECTED]>
To: Andy Ngo <[EMAIL PROTECTED]>; [email protected]
Sent: Thursday, January 18, 2007 9:27:36 AM
Subject: RE: Can the ARM and DSP accessing the RAM simulatenously?
Andy,
You are correct that if you didn't use any cache, only one processor could do
an instruction fetch at a given time. For that very reason, having the cache
disabled would be a poor design decision.
In general you shouldn't need multiple external memories to run both ARM and
DSP code. We have plenty of examples that do it, e.g. H.264 encode at D1 res
and 30fps. The ARM and the DSP each have their own cache. In fact, the DSP
has a two-level cache (L1P, L1D, and L2). Of course there are limits to what
the cache will buy you. It works best when you have a lot of re-use of
code/data.
Brad
_______________________________________________
Davinci-linux-open-source mailing list
[email protected]
http://linux.davincidsp.com/mailman/listinfo/davinci-linux-open-source