Jason,

Fair point. One of our guys is currently trying to get ChipScope configured
to make sure all our control signals are correct. We'll definitely look at
that signal too. Hopefully that will finally put this issue to rest.

Thanks for the tip,

Richard Black

On Mon, Oct 27, 2014 at 10:47 AM, Jason Manley <jman...@ska.ac.za> wrote:

> Yep, ok, so whoever did it (Dave?) already knows about this issue and has
> dealt with it. So scratch that idea then! Only other thing to check is to
> make sure you don't actually toggle that software register until the core
> is configured.
>
> Jason Manley
> CBF Manager
> SKA-SA
>
> Cell: +27 82 662 7726
> Work: +27 21 506 7300
>
> On 27 Oct 2014, at 18:38, Richard Black <aeldstes...@gmail.com> wrote:
>
> > By "enable" port, I assume you mean the "valid" port. I've been looking
> at the PAPER model carefully for some time now, and that is how it
> operates. It has a gated valid signal with a software register on each
> 10-GbE core.
> >
> > Once again, this is not our model. This is one made available on the
> CASPER wiki and run without modifications.
> >
> > Richard Black
> >
> > On Mon, Oct 27, 2014 at 10:34 AM, Jason Manley <jman...@ska.ac.za>
> wrote:
> > I suspect the 10GbE core's input FIFO is overflowing on startup. One key
> thing with this core is to ensure that your design keeps the enable
> port held low until the core's been configured. The core becomes unusable
> once the TX FIFO overflows. This has been a long-standing bug (my emails
> trace back to 2009) but it's so easy to work around that I don't think
> anyone's bothered looking into fixing it.
> >
> > Jason Manley
> > CBF Manager
> > SKA-SA
> >
> > Cell: +27 82 662 7726
> > Work: +27 21 506 7300
> >
> > On 27 Oct 2014, at 18:25, Richard Black <aeldstes...@gmail.com> wrote:
> >
> > > Jason,
> > >
> > > Thanks for your comments. While I agree that changing the ADC
> frequency mid-operation is non-kosher and could result in uncertain
> behavior, the issue at hand for us is to figure out what is going on with
> the PAPER model that has been published on the CASPER wiki. This naturally
> won't be (and shouldn't be) the end-all solution to this problem.
> > >
> > > This is a reportedly fully-functional model that shouldn't require any
> major changes in order to operate. However, this has clearly not been the
> case in at least two independent situations (us and Peter). This begs the
> question: what's so different about our use of PAPER?
> > >
> > > We, at BYU, have made painstakingly sure that our IP addressing
> schemes, switch ports, and scripts are all configured correctly (thanks to
> David MacMahon for that, btw), but we still have hit the proverbial brick
> wall of 10-GbE overflow.  When I last corresponded with David, he explained
> that he remembers having a similar issue before, but can't recall exactly
> what the problem was.
> > >
> > > > In any case, the fact that turning down the ADC clock prior to
> start-up prevents the 10-GbE core from overflowing is a major lead for us
> at BYU (we've been spinning our wheels on this issue for several months
> now). By no means are we proposing mid-run ADC clock modifications, but
> this appears to be a very subtle (and quite sinister, in my opinion) bug.
> > >
> > > Any thoughts as to what might be going on?
> > >
> > > Richard Black
> > >
> > > On Mon, Oct 27, 2014 at 2:41 AM, Jason Manley <jman...@ska.ac.za>
> wrote:
> > > Just a note that I don't recommend you adjust FPGA clock frequencies
> while it's operating. In theory, you should do a global reset in case the
> PLL/DLLs lose lock during clock transitions, in which case the logic could
> be in an uncertain state. But the Sysgen flow just does a single POR.
> > >
> > > A better solution might be to keep the 10GbE cores turned off (enable
> line pulled low) on initialisation, until things are configured (tgtap
> started etc), and only then enable the transmission using a SW register.
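The start-up ordering described above can be sketched as follows. This is a minimal illustration only: `FakeFpga`, the register name `gbe_tx_enable`, and `configure_core` are hypothetical stand-ins rather than the PAPER scripts or any real control library, and the overflow model simply encodes the behaviour reported in this thread (data reaching an unconfigured core leaves it stuck).

```python
class FakeFpga:
    """Hypothetical stand-in for a ROACH register interface (not a real API)."""

    def __init__(self):
        self.regs = {"gbe_tx_enable": 0}  # hypothetical software register
        self.core_configured = False
        self.tx_overflowed = False

    def configure_core(self):
        # Stands in for starting tgtap and setting MAC/IP/port on the core.
        self.core_configured = True

    def write_int(self, name, value):
        self.regs[name] = value
        # Assumed model of the reported bug: enabling data into a core that
        # is not yet configured overflows the TX FIFO, and that state sticks.
        if name == "gbe_tx_enable" and value and not self.core_configured:
            self.tx_overflowed = True


def bring_up(fpga):
    """Enable transmission only after the core has been configured."""
    fpga.write_int("gbe_tx_enable", 0)  # 1. hold the enable/valid gate low
    fpga.configure_core()               # 2. configure the 10GbE core
    fpga.write_int("gbe_tx_enable", 1)  # 3. only now let data through
    return not fpga.tx_overflowed


if __name__ == "__main__":
    print("clean start:", bring_up(FakeFpga()))
```

Swapping steps 2 and 3 in this toy model sets `tx_overflowed`, which is the failure mode the ordering is meant to avoid.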
> > >
> > > Jason Manley
> > > CBF Manager
> > > SKA-SA
> > >
> > > Cell: +27 82 662 7726
> > > Work: +27 21 506 7300
> > >
> > > On 25 Oct 2014, at 10:34, peter <peterniu...@163.com> wrote:
> > >
> > > > Hi Richard, Joe, & all,
> > > > Thanks for your help. It can finally receive packets now!
> > > > As you pointed out, after enabling the ADC card and running the bof
> file (./adc_init.rb roach1 bof file) at 200 MHz (or higher), we need to
> run the F-engine init script at about 75 MHz (./paper_feng_init.rb roach1:0).
> That allows the packets to flow, and then we can turn the frequency
> up. However, the final ADC clock frequency only got up to 120 MHz in my
> experiment. Our final ADC frequency standard is 250 MHz. Maybe I need to run
> the bof file at a higher ADC frequency first to reach a final steady 250 MHz
> ADC clock frequency.
> > > > Why does it need to be initialised at a lower frequency and then turned up? That doesn't
> make sense. Is the hardware going wrong? As the yellow block adc16*250-8 is
> designed for 250 MHz, it should be OK at 200 MHz or 250 MHz. What was the
> final frequency in your experiment?
> > > > Any reply will be helpful!
> > > > Best Regards!
> > > > peter
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > At 2014-10-25 00:36:52, "Richard Black" <aeldstes...@gmail.com>
> wrote:
> > > > Peter,
> > > >
> > > > That's correct. We downloaded the FPGA firmware and programmed the
> ROACH with the precompiled bitstream. When we didn't get any data beyond
> that single packet, we stuck some overflow status registers in the model
> and found that we were overflowing at 1025 64-bit words (i.e. 8200 bytes).
> > > >
> > > > We have actually found a way to get packets to flow, but it isn't a
> good fix. When we turn the ADC clock frequency down to about 75 MHz, the
> packets begin to flow. There is an opinion in our group that the 10-GbE
> buffer overflow is a transient behavior, and, hence, if we slowly turn up
> the clock frequency after the ROACH has started up, packets may continue to
> flow in steady-state operation. We haven't tested this yet, though.
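The sizes quoted in this thread can be checked with a few lines of arithmetic. This is only a sketch: the constant and function names are made up for illustration, and the two packet lengths are the UDP lengths reported by tcpdump in the messages quoted further down.

```python
WORD_BYTES = 8                     # one 64-bit word
OVERFLOW_WORDS = 1025              # overflow point reported above
TX_BUFFER_LIMIT = 8 * 1024 + 512   # the "8K+512" limit quoted below, in bytes


def describe(length_bytes):
    """Return (whole 64-bit words, leftover bytes, below overflow size?)."""
    words, leftover = divmod(length_bytes, WORD_BYTES)
    return words, leftover, length_bytes < OVERFLOW_WORDS * WORD_BYTES


if __name__ == "__main__":
    print(OVERFLOW_WORDS * WORD_BYTES)   # 8200
    print(TX_BUFFER_LIMIT)               # 8704
    for length in (6456, 4616):          # lengths seen in the tcpdump output
        print(length, describe(length))
```

Both stray packets come out as a whole number of 64-bit words (807 and 577) and are shorter than 8200 bytes, which is consistent with, though not proof of, a partly filled buffer being flushed.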
> > > >
> > > > Richard Black
> > > >
> > > > On Thu, Oct 23, 2014 at 8:39 PM, peter <peterniu...@163.com> wrote:
> > > > Hi Richard,& All,
> > > > As you said, the size of the isolated packet changes every time:
> > > > tcpdump: verbose output suppressed, use -v or -vv for full protocol
> decode
> > > > listening on px1-2, link-type EN10MB (Ethernet), capture size 65535
> bytes
> > > > 10:10:55.622053 IP 10.10.2.1.8511 > 10.10.2.9.8511: UDP, length 4616
> > > > Did you download the PAPER gateware from the CASPER wiki (
> https://casper.berkeley.edu/wiki/PAPER_Correlator_Manifest ) directly?
> How does the PAPER bof file run on your system? Have you seen the overflow
> before? I downloaded and installed the PAPER model as the website says, but the
> overflow shows up when I run paper_feng_netstat.rb.
> > > > Thanks for your information.
> > > > peter
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > At 2014-10-24 09:59:12, "Richard Black" <aeldstes...@gmail.com>
> wrote:
> > > > Peter,
> > > >
> > > > I don't mean to hijack your thread, but we've been having a very
> similar (and time-absorbing) issue with the PAPER f-engine FPGA firmware
> here at BYU. Out of curiosity, does this single packet that you're
> receiving in tcpdump change in size every time you reprogram the ROACH?
> We've seen this happen, and we're pretty sure that this isolated packet is
> the 10-GbE buffer flushing when the 10-GbE core is initialized (i.e. the
> enable signal isn't sync'd with the start of new packet).
> > > >
> > > > Regardless of whether we have the same issue, I'm very interested to
> see this problem's resolution.
> > > >
> > > > Good luck,
> > > >
> > > > Richard Black
> > > >
> > > > On Thu, Oct 23, 2014 at 7:50 PM, peter <peterniu...@163.com> wrote:
> > > > Hi Joe, & All,
> > > > I found something this morning: one packet is sent out from the
> ROACH when I run the PAPER model, as seen in tcpdump on the HPC:
> > > > tcpdump: verbose output suppressed, use -v or -vv for full protocol
> decode
> > > > listening on px1-2, link-type EN10MB (Ethernet), capture size 65535
> bytes
> > > > 09:04:02.757813 IP 10.10.2.1.8511 > 10.10.2.9.8511: UDP, length 6456
> > > >
> > > > The length is not the expected 8200+8, and is far from the full TX buffer size of
> 8K+512. The other packets are stopped by the overflow.
> > > > I have tried changing the tutorial 2 packet size to 8200 bytes and to
> 8K+512 bytes. Both transfer fine. I also made sure the boundary size is
> indeed 8K+512, because when I change the size to 8K+513 bytes, no
> data is sent. So the packet received this morning, with length 6456, is well
> under the limit. But what caused the other packets to overflow?
> > > > Any suggestions would be helpful!
> > > > peter
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > At 2014-10-24 00:37:14, "Kujawski, Joseph" <jkujaw...@siena.edu>
> wrote:
> > > > Peter,
> > > >
> > > > By cadence of the broadcast, I mean how often are the 8200 byte
> packets sent.  Basically, I would like to determine how close your system
> is to the maximum data rate of the 10Gbe.
> > > >
> > > > Also, it would be instructive to know the following:
> > > >
> > > > 1) What transmission protocol are you using? (The One_GBe module
> uses UDP; are you using that or TCP?)
> > > >
> > > > 2) What NICs are you using on the receive side?
> > > >
> > > > At this time, I am working on the theory that the issue is related
> to the network itself not being able to sustain the data volume you are
> generating and would like to get a better idea of how much data is
> generated and how often it is sent.
> > > >
> > > > Thanks,
> > > >
> > > > -Joe Kujawski
> > > >
> > > >
> > > >
> > > > On Thu, Oct 23, 2014 at 12:01 PM, peter <peterniu...@163.com> wrote:
> > > > Hi Joe,
> > > > 1. Yes, actually we have 3 ROACH2 boards with 8 NICs.
> > > > 2. Well, each ROACH has 4 of its 8 NICs connected directly to a PC; the other 4
> connect to a 10Gb switch. I have connected the SFP cable (which should connect to the
> switch) directly to the PC to see whether the data comes out, but no data comes out
> because of the overflow.
> > > > 3. Could you give an example of the broadcast cadence? I am not
> familiar with this.
> > > > It does indeed require more data, but each packet is limited to 8200
> bytes.
> > > > Thanks for your reply!
> > > > peter
> > > > --
> > > > Sent from NetEase Mail for Android
> > > >
> > > >
> > > >
> > > > On 2014-10-23 23:16 , Kujawski, Joseph Wrote:
> > > >
> > > > Peter,
> > > >
> > > > I am downloading it now.  Can you answer these questions:
> > > >
> > > > 1) Do you have a standard PAPER architecture with two ROACH boards
> each containing 8 10GBe ports?
> > > >
> > > > 2) Please describe your internet architecture i.e. how are each of
> the ports connected.
> > > >
> > > > 3) What is the cadence of each broadcast?
> > > >
> > > > My current suspicion is that you are generating more data than you
> can push through your interface(s).  It may be that the higher data volume
> in your implementation requires more of a network infrastructure than was
> required by the original system.
> > > >
> > > > -Joe Kujawski
> > > >
> > > > On Thu, Oct 23, 2014 at 11:01 AM, peter <peterniu...@163.com> wrote:
> > > > This is a little big; roach2_tl8511port is the one that can send data
> normally. The environment should be OK now; last time crc32x64_con may
> have been missing.
> > > > Good night!
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > At 2014-10-23 22:52:54, "Kujawski, Joseph" <jkujaw...@siena.edu>
> wrote:
> > > > Peter,
> > > >
> > > > 1) For reference, here is a list of the errors:
> > > >
> > > > --------------------------------- Version Log ----------------------------------
> > > > Version                                 Path
> > > > System Generator 14.6                   C:/Xilinx/14.6/ISE_DS/ISE/sysgen
> > > > Matlab 8.0.0.783 (R2012b)               C:/MATLAB/R2012b
> > > > ISE                                     C:/Xilinx/14.6/ISE_DS/ISE
> > > > --------------------------------------------------------------------------------
> > > > Summary of Errors:
> > > > Error 0001: Could not find the configuration m-function "crc32x64_con...
> > > >      Block: 'roach2_fengine_tl8511port/transpose/Transpose1/crc/crc32x64'
> > > > Error 0002: Could not find the configuration m-function "crc32x64_con...
> > > >      Block: 'roach2_fengine_tl8511port/transpose/Transpose2/crc/crc32x64'
> > > > Error 0003: Could not find the configuration m-function "crc32x64_con...
> > > >      Block: 'roach2_fengine_tl8511port/transpose/Transpose3/crc/crc32x64'
> > > > Error 0004: Could not find the configuration m-function "crc32x64_con...
> > > >      Block: 'roach2_fengine_tl8511port/transpose/Transpose4/crc/crc32x64'
> > > > --------------------------------------------------------------------------------
> > > >
> > > > 2) Your email did not have an attachment.  I have more comments, but
> wanted to let you know about the attachment before you went to bed.
> > > >
> > > > -Joe Kujawski
> > > >
> > > >
> > > >
> > > >
> > > > On Thu, Oct 23, 2014 at 10:33 AM, peter <peterniu...@163.com> wrote:
> > > >
> > > > Hi Joe,
> > > > Thanks for your warm help!
> > > > What error shows up when you compile my model? Is there some file
> missing? I will pack up all my files and send them to you as an attachment. And how about
> the PAPER one? Did it report an overflow message? You need to install and use
> Ruby to control it.
> > > > Leaving the PAPER model aside, let's talk about the 10GbE block on
> ROACH v2. Your model is good for seeing Data_valid, EOF, etc., but I
> don't know how to add your model to PAPER, as I realize PAPER generates
> data valid and EOF from a counter. So I don't know where to put the
> model. For example, if I put the data_valid or EOF control process you
> designed on the 10GbE port in the PAPER model, then I think it is equivalent to adding a
> 10GbE block instead of the One_GBe block in yours. *_*!!
> > > > I changed the number 50 to 1025 in tutorial 2 to make the packet size
> 8200 bytes, and it seems to transfer well without errors. That is with a spacing of
> 1.3*1025, meaning 1 packet is sent every 1.3*1025 clocks. I found the boundary
> value of 1.3*1025 by testing many times. When I change the spacing
> to less than 1.3*1025 clocks, the first few packets are sent out, but then the overflow
> comes. I think it is the transfer rate that determines the overflow.
> > > > Thanks for your suggestions and advice!
> > > > peter
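One way to read the 1.3*1025 boundary is as a payload data rate. The sketch below assumes a 200 MHz fabric clock, which this thread does not state for the tutorial 2 design, and it ignores Ethernet/IP/UDP header overhead; the function name and defaults are illustrative only.

```python
def payload_rate_gbps(clock_hz, words_per_packet=1025,
                      spacing_factor=1.3, word_bits=64):
    """Average payload rate when one packet is sent every
    spacing_factor * words_per_packet clock cycles."""
    bits_per_packet = words_per_packet * word_bits
    clocks_per_packet = spacing_factor * words_per_packet
    return bits_per_packet * clock_hz / clocks_per_packet / 1e9


if __name__ == "__main__":
    assumed_clock_hz = 200e6  # assumption, not taken from the thread
    for factor in (1.0, 1.3, 2.0):
        rate = payload_rate_gbps(assumed_clock_hz, spacing_factor=factor)
        print(f"spacing {factor} x 1025 clocks: {rate:.2f} Gb/s")
```

Under that assumed clock, a spacing factor of 1.3 works out to roughly 9.85 Gb/s of payload, so packing packets any closer would ask for more than a 10 Gb/s link can carry. A different clock rate would move the boundary accordingly.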
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > At 2014-10-23 00:14:29, "Kujawski, Joseph" <jkujaw...@siena.edu>
> wrote:
> > > >
> > > > Peter,
> > > >
> > > > I find that I can not compile and simulate your design, however,
> looking at the code structure, I can't tell if tx_val and tx_EOF are high
> at the same time:
> > > >
> > > >
> > > >
> > > > Also, I modified the design to send out a packet of size 8200 once
> per second (model attached) and added a register that latches the GBE
> tx_aful and tx_overrun lines so they can be read through the KATCP
> interface.  Modify the model to remove the oscilloscope and Xilinx out
> gateways before compiling it for your platform.  Note that this model does
> not check for overflow, though the latch will let you know if you have had
> one.
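The latching register described above behaves like a sticky flag: once the monitored line goes high for even one clock, the latch stays set until it is explicitly cleared, so a slow software read still catches a brief event. A minimal behavioural sketch, with a hypothetical class name and no connection to the actual Simulink blocks:

```python
class StickyLatch:
    """Behavioural model of a set-and-hold status bit."""

    def __init__(self):
        self.value = 0

    def clock(self, signal, reset=False):
        """Advance one clock cycle and return the latched value."""
        if reset:
            self.value = 0
        elif signal:
            self.value = 1
        return self.value


if __name__ == "__main__":
    overrun_latch = StickyLatch()
    # tx_overrun pulses high for a single cycle, then drops again.
    for tx_overrun in (0, 0, 1, 0, 0):
        overrun_latch.clock(tx_overrun)
    print("overrun seen:", bool(overrun_latch.value))
```

Reading the latch after the run reports the overrun even though the line itself has already returned low.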
> > > >
> > > > Let me know how this works for you.
> > > >
> > > > -Joe Kujawski
> > > > --
> > > > **************************************
> > > > * Joe Kujawski
> > > > * Siena College
> > > > * Dept. of Physics and Astronomy, RB 113
> > > > * 515 Loudon Road
> > > > * Loudonville, NY 12211-1462
> > > > *
> > > > * Email: jkujaw...@siena.edu
> > > > * Phone: 518-867-7509  <-- NEW NUMBER
> > > > * Fax: 518-783-2986
> > > > **************************************
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > **************************************
> > > > * Joe Kujawski
> > > > * Siena College
> > > > * Dept. of Physics and Astronomy, RB 113
> > > > * 515 Loudon Road
> > > > * Loudonville, NY 12211-1462
> > > > *
> > > > * Email: jkujaw...@siena.edu
> > > > * Phone: 518-867-7509  <-- NEW NUMBER
> > > > * Fax: 518-783-2986
> > > > **************************************
> > > >
> > > > Cloud attachment sent from NetEase 163 Mail
> > > >
> > > > paperfengine.zip (126.71M, expires 2014-11-07 22:58)
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > **************************************
> > > > * Joe Kujawski
> > > > * Siena College
> > > > * Dept. of Physics and Astronomy, RB 113
> > > > * 515 Loudon Road
> > > > * Loudonville, NY 12211-1462
> > > > *
> > > > * Email: jkujaw...@siena.edu
> > > > * Phone: 518-867-7509  <-- NEW NUMBER
> > > > * Fax: 518-783-2986
> > > > **************************************
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > **************************************
> > > > * Joe Kujawski
> > > > * Siena College
> > > > * Dept. of Physics and Astronomy, RB 113
> > > > * 515 Loudon Road
> > > > * Loudonville, NY 12211-1462
> > > > *
> > > > * Email: jkujaw...@siena.edu
> > > > * Phone: 518-867-7509  <-- NEW NUMBER
> > > > * Fax: 518-783-2986
> > > > **************************************
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>
