"Demux_factor" was added so that you don't have huge buswidths.
Instead of outputting the four complex products (XX, YY, XY, YX) on
one bus, typically 128 bits wide (for 4-bit inputs and an acc_len of
128), you can demux by 1, 2, 4 or 8 and reduce this to as little as
128/8 = 16 bits if you want. This is useful if your long-term vacc has
a narrower bus width (as is the case with the QDR). It doesn't cost
much, because you can leverage the ability of the descrambler's
dual-ported BRAM to have different port widths.
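To make the arithmetic concrete, here is a minimal Python sketch of the bus width for each allowed demux factor. The function name is hypothetical, and the 128-bit starting width is just the typical figure quoted above, not something read out of the block itself.

```python
# Hypothetical helper: output bus width after demuxing.
# Assumes the undemuxed output is full_width_bits wide (128 in the
# typical case quoted above) and that only 1, 2, 4 or 8 are allowed.
ALLOWED_DEMUX_FACTORS = (1, 2, 4, 8)


def demuxed_bus_width(full_width_bits, demux_factor):
    if demux_factor not in ALLOWED_DEMUX_FACTORS:
        raise ValueError("demux_factor must be 1, 2, 4 or 8")
    return full_width_bits // demux_factor


# Width for every allowed factor, starting from 128 bits.
widths = {d: demuxed_bus_width(128, d) for d in ALLOWED_DEMUX_FACTORS}
```

The narrower the bus, the more clock cycles it takes to read out the same accumulation, so the factor is a width-for-time trade.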
I'll also note that if you're processing single-pol data, you might
want to reduce the length of the X engines by a factor of 2 and use
the second pol for another antenna. Otherwise you're wasting a lot of
resources in the X engines. The other pols will not get stripped off
in the green block if you don't use 'em, because it's feeding into a
BRAM (the descrambler).
Jason
On 31 Dec 2009, at 00:45, Aaron Parsons wrote:
Dear Glenn (cc Jason, CASPER),
Here are some answers to your questions, and a preview of some plans I
had for the X engine block at the end.
I am looking at the r_4f_2x_16a_r322b design for guidance.
Since we have n_freqs * n_ants data units going into the X engine and
something like n_freqs * n_ants * (n_ants + 1) / 2 data units coming
out, it's clear that we need to integrate for at least n_ants/2
internally to the X engine.
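As a rough check of this counting argument, a minimal Python sketch is given here. The function name is hypothetical; the values 256 and 8 are taken from the loop bounds in the data-order question in this thread.

```python
# Hypothetical helper: count data units into and out of the X engine
# using the two expressions quoted in the question.
def x_engine_data_units(n_freqs, n_ants):
    units_in = n_freqs * n_ants
    # n_ants * (n_ants + 1) is always even, so integer division is exact.
    units_out = n_freqs * n_ants * (n_ants + 1) // 2
    return units_in, units_out


units_in, units_out = x_engine_data_units(n_freqs=256, n_ants=8)
# Ratio of output to input units; equals (n_ants + 1) / 2.
growth = units_out / units_in
```

With these numbers the output count is 4.5 times the input count, which is why some integration inside the X engine is needed to keep the output rate manageable.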
From the block documentation, I think the data order I need to feed
the X engine is as follows:
for freq in range(256):
    for antenna in range(8):
        for timestep in range(accumulation_length):
            data(antenna, timestep, freq)
Is that about right?
This is correct
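For anyone who wants to generate or check this ordering in software, the nested loops can be written as a small Python generator. This is only a sketch of the ordering confirmed above; the function name and the tiny example sizes are made up for illustration.

```python
# Hypothetical generator: yields (antenna, timestep, freq) index
# tuples in the order the X engine expects its input. Frequency is
# the slowest-changing index and timestep the fastest.
def x_engine_input_order(n_freqs, n_ants, acc_len):
    for freq in range(n_freqs):
        for antenna in range(n_ants):
            for timestep in range(acc_len):
                yield (antenna, timestep, freq)


# Small illustrative case: 2 channels, 2 antennas, acc_len of 3.
order = list(x_engine_input_order(n_freqs=2, n_ants=2, acc_len=3))
```

Each antenna contributes acc_len consecutive samples of one frequency channel before the next antenna starts, and all antennas are cycled through before the frequency index advances.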
Is the only impact of the number of frequency channels on the
accumulation length?
The number of freq channels has no impact on the accumulation length.
For a fixed bandwidth from the antenna, the bandwidth into an X engine
doesn't depend on number of channels. This is because each X engine
processes many frequencies; if the number of channels is increased,
then each X engine must process more frequency channels, but those
channels come less frequently. The real constraints on accumulation
length are that for a "Nacc" chunk of samples into an X engine, the
Nant/2 products that result from that chunk must be output before the
next Nacc chunk comes, and the aggregate bandwidth out of the X engine
must be low enough to go into a longer-term accumulator.
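The claim that the bandwidth into an X engine does not depend on the number of channels can be sketched numerically. The Python below assumes a critically sampled channelizer (each channel produces samples at bandwidth / n_chans) and channels divided evenly among the X engines. Both are assumptions made for this illustration, and the function name and the 100 MHz / 8 X engine figures are hypothetical.

```python
from fractions import Fraction


# Hypothetical helper: per-antenna sample rate into one X engine.
# Assumes critical sampling and an even split of channels.
def x_engine_input_rate(bandwidth_hz, n_chans, n_xengs):
    per_channel_rate = Fraction(bandwidth_hz, n_chans)  # samples/s per channel
    chans_per_xeng = Fraction(n_chans, n_xengs)         # channels per X engine
    return per_channel_rate * chans_per_xeng


# Same bandwidth and X engine count, three different channel counts.
rates = [x_engine_input_rate(100_000_000, n, 8) for n in (1024, 2048, 4096)]
```

The n_chans terms cancel: more channels per X engine, but each one arriving proportionally less often, so all three rates come out identical.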
I notice in the reference design, the number of antennas is 16 and
the acc length is 128, which happens to give 16*128 = 2048, the
number of frequency channels. Is this a coincidence, or a requirement?
coincidence
What exactly is Demux_factor? In the reference design I notice that
it is set to 8 and the design also mentions 8 X engines in the
system... I think since I have 4 complex data streams in my design,
I'll need 4 X engines, so do I need demux_factor to be 4?
I'm not sure what this demux_factor is that you are talking about, but
your computation of the # of X engines is correct.
The X engine is set up for 2 polarizations. In my case there is only
one polarization, so I plan to tie the other polarization to 0 and
leave the unused polarization outputs disconnected. Is that the right
thing to do?
That should be right. As long as you don't use the outputs, the other
pols will get stripped off.
Something to add to the above discussion is that I've black-boarded
(that is, drawn on the black board in my office) a new design for an X
engine that, for a window of data in, produces a contiguous window of
data out. This should remove the need for the X engine unscrambler
block and simplify the correlator design a little bit. I am also
proposing, as part of this revision, that we move to more heavily
multiplexing the accumulated output from X engines. This involves
doing away with parallel polarizations (they should just be treated as
separate antennas), and having real/imag samples multiplexed onto the
same bus. This should reduce to some extent the logic used in
handling and re-multiplexing these wide busses. Any objections?
--
Aaron Parsons
510-406-4322 (cell)
Campbell Hall 523, UCB