Thanks for sharing, Ben!

It's not easy to acknowledge solving "silly" problems of one's own making, but 
it is very valuable/helpful to have these explained on the mailing list.  
Things like this are far more common than one would perhaps care to admit, 
especially among more experienced CASPER folks :P, so it's nice to have a 
reminder in the archives for future victims of such self-induced silliness.

FWIW, my Achilles heel is usually forgetting to increase the MTU setting on the 
switch ports and/or the network interfaces of the receiving computer.
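
For anyone wondering why the MTU bites, here is a minimal Python sketch of the arithmetic. It assumes plain IPv4 with no IP options and the standard 8-byte UDP header. The function names and the 8192-byte example payload are made up for illustration; the 512-byte payload corresponds to Ben's 64 x 64-bit samples.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
UDP_HEADER_BYTES = 8    # fixed-size UDP header


def max_udp_payload(mtu):
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - UDP_HEADER_BYTES


def fits_in_mtu(payload_bytes, mtu):
    """True if a UDP payload of this size fits in a single packet."""
    return payload_bytes <= max_udp_payload(mtu)


if __name__ == "__main__":
    # 512 bytes = 64 samples x 8 bytes; 8192 bytes is a hypothetical jumbo payload.
    for mtu in (1500, 9000):
        for payload in (512, 8192):
            print(mtu, payload, fits_in_mtu(payload, mtu))
```

A 512-byte payload fits comfortably in the default 1500-byte MTU, whereas the hypothetical 8192-byte payload only fits once every hop (switch ports and receiving NIC) is set to a jumbo MTU such as 9000.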

It might be useful and maybe even fun(ny?) to start a poll on the CASPER wiki 
to find out which is the most common and/or silliest silly problem.  In the end 
I find these types of problems have equal parts frustration ("why isn't this 
working?!"), embarrassment/astonishment ("I can't believe I did/forgot 
that!?"), and relief/joy ("yay, it's working now!!"), usually in that order. :)

Cheers,
Dave

> On Oct 22, 2020, at 17:06, 'Benjamin Godfrey' via [email protected] 
> <[email protected]> wrote:
> 
> Hi Marc and Jack, 
>    Thank you again for all the suggestions. After fiddling for a while, the 
> answer ended up being sillier than I expected: I had selected the wrong slot 
> number on the ten gbe yellow block, and I was also using the wrong SFP+ port 
> on the PC side of things. No doubt there will be other issues that come up, 
> but I can at least digitize mock data using the ROACH2 and read it using a 
> Python script now. 
> 
> - Ben G.
> 
> On Tue, Oct 20, 2020 at 3:02 AM Marc <[email protected]> wrote:
> Hi
> 
> Hmm... if you are capable of pinging things in one direction, then
> tcpborphserver is at least partially up - amongst other things, it is
> responsible for picking up frames from the fpga and handing them
> off to the kernel, which then does the IP logic and vice versa.
> 
> You seem to have problems with arp, and say that you have
> prepopulated the arp tables on the roach with set arp - maybe
> you will have to do the same on the PC side. Linux at least
> has an "arp -s" command to hardcode them into the PC arp
> cache (cat /proc/net/arp).
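
To check from a script whether a static entry actually landed in the PC's ARP cache, something like the following sketch could work. It assumes the usual six-column layout of /proc/net/arp on Linux and the ATF_COM/ATF_PERM flag values from <net/if_arp.h>. The sample table, the interface name eth1 and the helper names are hypothetical; the MAC shown is just mac_base+20 from Ben's script.

```python
ATF_COM = 0x02   # entry is complete (MAC known)
ATF_PERM = 0x04  # entry is permanent (e.g. added with "arp -s")

# Hypothetical example of what /proc/net/arp might contain.
SAMPLE_ARP_TABLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.168.41.20    0x1         0x6         02:02:00:00:00:14     *        eth1
192.168.41.99    0x1         0x0         00:00:00:00:00:00     *        eth1
"""


def parse_proc_net_arp(text):
    """Parse /proc/net/arp-style text into a list of dicts."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the column-header line
        fields = line.split()
        if len(fields) < 6:
            continue
        ip, _hw_type, flags, mac, _mask, device = fields[:6]
        entries.append(
            {"ip": ip, "flags": int(flags, 16), "mac": mac, "device": device}
        )
    return entries


def is_static(entry):
    """True if the entry is both complete and permanent."""
    return entry["flags"] & (ATF_COM | ATF_PERM) == (ATF_COM | ATF_PERM)


if __name__ == "__main__":
    for entry in parse_proc_net_arp(SAMPLE_ARP_TABLE):
        print(entry["ip"], entry["mac"], "static" if is_static(entry) else "not static")
```

On a real machine the text would come from open("/proc/net/arp").read() instead of the sample string.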
> 
> Note that the roaches do arp in an unusual way - they iterate over
> the subnet (fixed size) and query the hardware addresses
> periodically and pre-emptively, unlike normal arp which only does
> that on demand. This is needed as the ppc/tcpborphserver
> might have no idea which stations the fpga is trying
> to reach. So if you run tcpdump on a PC, you should
> see these queries all the time, if the tap device is up.
> 
> There are commands like ?tap-info and ?tap-arp-reload
> which might give you more detail, either on the roach
> type "kcpcmd tap-info", or remotely
> 
> echo "?tap-info" | nc -q 2 -w 2 ip-of-roach 7147
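
The same request can be sent from Python with only the standard library. This is a rough sketch, not the casperfpga API: it assumes the server speaks line-based KATCP on port 7147 (as in the nc example), that a request is "?" plus its name, and that the final reply line starts with "!" plus the same name. The function names are made up for illustration.

```python
import socket


def format_request(name):
    """Build one request line: '?' + request name + newline."""
    return ("?" + name + "\n").encode("ascii")


def send_request(host, name, port=7147, timeout=2.0):
    """Send one request; return the lines received up to the reply or a timeout."""
    lines = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(format_request(name))
        with sock.makefile("rb") as reader:
            try:
                for raw in reader:
                    line = raw.decode("ascii", errors="replace").rstrip("\r\n")
                    lines.append(line)
                    if line.startswith("!" + name):
                        break  # the reply line marks the end of the response
            except socket.timeout:
                pass  # return whatever arrived before the timeout
    return lines


if __name__ == "__main__":
    # "ip-of-roach" is a placeholder, exactly as in the nc example.
    for line in send_request("ip-of-roach", "tap-info"):
        print(line)
```

Lines starting with "#" are informs that arrive before the reply, so they are collected along with it.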
> 
> Note that you will have to use those commands, rather
> than looking in /proc/net/arp on the roach, as arp
> isn't handled by the ppc linux kernel - those tables
> have to be shared with the fpga.
> 
> regards
> 
> marc
> 
> On Tue, Oct 20, 2020 at 8:42 AM 'Benjamin Godfrey' via
> [email protected] <[email protected]> wrote:
> >
> > Hi Jack,
> >    Thank you for all your suggestions. Really appreciate all the 
> > troubleshooting help. Going through your suggestions in order:
> >
> > - EOF is going low with the final valid signal in simulation
> > - But valid is always high when I read the snapshot block, which is 
> > unexpected (need to dig further to figure out why this is happening). EOF, 
> > though, is still going high for one clock cycle at the  expected time.
> > - Reading from the transmit full output reports false, but I don't really 
> > understand this since the valid signal is always high.
> >
> > I was having issues with the tap interface populating the ARP table with 
> > correct addresses so I've now taken to populating it manually (using 
> > set_arp_table, which I found in the docs). Furthermore, I've had problems 
> > being able to ping the ROACH from the PC. I am now able to ping the PC 
> > logged into the ROACH, but I am unable to ping the ROACH from the PC side. 
> > Do you know why this may be the case?
> >
> > I definitely have some paths to explore.
> >
> > Thanks,
> > Ben G.
> >
> > On Tue, Oct 20, 2020 at 12:56 AM Jack Hickish <[email protected]> wrote:
> >>
> >> Hi Ben,
> >>
> >> Before getting too far into the power PC software side, some basic checks 
> >> in firmware which are probably worth doing -
> >>
> >> - does EOF go high with (not after) the last valid sample?
> >> - can you (using a snapshot block) verify that what is happening in 
> >> firmware with the vld / EOF signals matches your simulation?
> >> - do you have the ability to read the TGE overflow outputs, which are a 
> >> good indicator of something going awry?
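
The first two checks can be automated once the snapshot data is on the PC. The sketch below assumes the valid and EOF bits have already been read out into two equal-length Python sequences, one entry per FPGA clock; how they are captured is not shown, and the function name is hypothetical.

```python
def check_frames(valid, eof):
    """Check per-clock valid/eof samples.

    Returns (frame_lengths, problems): the number of valid words in each
    frame that ended with eof high on a valid word, and a list of messages
    for every eof pulse that was not coincident with valid.
    """
    frame_lengths = []
    problems = []
    count = 0
    for i, (v, e) in enumerate(zip(valid, eof)):
        if v:
            count += 1
        if e and v:
            frame_lengths.append(count)  # eof arrived with the last valid word
            count = 0
        elif e:
            problems.append("eof high without valid at sample %d" % i)
    return frame_lengths, problems


if __name__ == "__main__":
    good_valid = [0, 1, 1, 1, 0, 1, 1, 1, 0]
    good_eof = [0, 0, 0, 1, 0, 0, 0, 1, 0]
    print(check_frames(good_valid, good_eof))

    late_valid = [1, 1, 0]
    late_eof = [0, 0, 1]  # eof one clock after the last valid word
    print(check_frames(late_valid, late_eof))
```

An empty problems list together with the expected words-per-frame count would suggest the hardware matches the simulation.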
> >>
> >> If you compile with the "enable core on startup" option on the TGE block 
> >> checked, you should be able to transmit regardless of the software "tap" 
> >> interface. The tap interface is needed to handle things like ARP, but even 
> >> without it you should see packets coming out of your board on tcpdump, 
> >> even if they all have the wrong destination MAC addresses.
> >>
> >> Good luck!
> >>
> >> Jack
> >>
> >>
> >> On Mon, 19 Oct 2020, 6:48 pm 'Benjamin Godfrey' via 
> >> [email protected] <[email protected]> wrote:
> >>>
> >>> Hi everyone,
> >>>     I've been getting my feet wet the last little while introducing 
> >>> myself to using the ROACH 2 toolset. After being unable to get Tutorial 2 
> >>> to work out of the box, I took a step back and am trying to just transmit 
> >>> packets using the SFP+ port to my PC and read them out. My design is 
> >>> heavily based on the one in the Roach 2 tutorial except that I am using 
> >>> the katadc to generate data. What I've tried so far is detailed below:
> >>>
> >>> Right now, I am trying to send 64, 64-bit samples at ~390 kHz (feeding in 
> >>> an 800 MHz clock) so I shouldn't be overfilling buffers. In simulation, 
> >>> it looks like tx_valid and tx_end_of_frame are both being set as I would 
> >>> expect, namely tx_valid goes high whenever I am sending data and 
> >>> tx_end_of_frame goes high for one clock cycle at the end of when I expect 
> >>> to send data.
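
A quick back-of-the-envelope check of that rate, as a sketch. It assumes "~390 kHz" is the packet rate and takes 800 MHz / 2048 = 390.625 kHz as its nominal value; both are assumptions, since the thread does not say how the figure was derived. Ethernet, IP and UDP header overhead is ignored.

```python
SAMPLES_PER_PACKET = 64
BITS_PER_SAMPLE = 64
PACKET_RATE_HZ = 800e6 / 2048  # 390.625 kHz, assumed origin of "~390 kHz"
LINK_RATE_BPS = 10e9           # nominal 10 GbE line rate

payload_bytes = SAMPLES_PER_PACKET * BITS_PER_SAMPLE // 8
payload_rate_bps = SAMPLES_PER_PACKET * BITS_PER_SAMPLE * PACKET_RATE_HZ
link_fraction = payload_rate_bps / LINK_RATE_BPS

print("payload per packet: %d bytes" % payload_bytes)
print("payload rate: %.2f Gb/s" % (payload_rate_bps / 1e9))
print("fraction of 10 GbE: %.0f%%" % (100 * link_fraction))
```

Under these assumptions the payload rate is about 1.6 Gb/s, well under the 10 GbE line rate, which supports the expectation that buffers should not be overfilling.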
> >>>
> >>> On the PC side of things, I also wasn't able to get the Python script to 
> >>> work out of the box, so I modified it a little bit using suggestions from 
> >>> the mail archives. The important details are below:
> >>>
> >>> ip_base = 192*(2**24) + 168*(2**16) + 41*(2**8)
> >>> mac_base = (2<<40) + (2<<32)
> >>> fabric_port = 60000
> >>> gbe_tx = casperfpga.tengbe.TenGbe(fpga, 'gbe0', ip_base+20, 512)
> >>> gbe_tx.setup(mac_base+20, ip_base+20, fabric_port)
> >>> gbe_tx.tap_start()
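
To see what those integers actually encode, here is a small sketch using only the standard library. The two helper functions are hypothetical names; ip_base and mac_base are copied from the snippet above.

```python
import ipaddress

ip_base = 192*(2**24) + 168*(2**16) + 41*(2**8)
mac_base = (2 << 40) + (2 << 32)


def int_to_ip(value):
    """Format a 32-bit integer as a dotted-quad IPv4 address."""
    return str(ipaddress.IPv4Address(value))


def int_to_mac(value):
    """Format a 48-bit integer as a colon-separated MAC address."""
    return ":".join("%02x" % b for b in value.to_bytes(6, "big"))


print(int_to_ip(ip_base + 20))    # 192.168.41.20
print(int_to_mac(mac_base + 20))  # 02:02:00:00:00:14
```

So the core is configured as 192.168.41.20 with MAC 02:02:00:00:00:14, which is the address to look for in ARP tables and packet captures.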
> >>>
> >>> I am trying to write data to 192.168.41.1 (same subnet as the ROACH2), on 
> >>> my PC, connected to the ROACH2 using an SFP+ cable. To do this, I've used 
> >>> the socket library in Python configured to listen for UDP packets at the 
> >>> required IP/port. However, I am unable to get any data from the ROACH2 
> >>> whatsoever. I've looked to see if I have any packets coming over the port 
> >>> using Wireshark and I see nothing.
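
For reference, a minimal UDP receiver along those lines might look like the sketch below. It is not Ben's actual script. It assumes each packet carries exactly 64 samples of 64 bits and that the words are big-endian unsigned integers (an assumption about the firmware). The destination port is left as a parameter because the thread does not state it, and the function names are hypothetical.

```python
import socket
import struct

SAMPLES_PER_PACKET = 64
BYTES_PER_SAMPLE = 8


def unpack_samples(payload):
    """Decode one payload into a tuple of 64-bit unsigned integers.

    Assumes big-endian word order; raises ValueError on an unexpected size.
    """
    expected = SAMPLES_PER_PACKET * BYTES_PER_SAMPLE
    if len(payload) != expected:
        raise ValueError("expected %d bytes, got %d" % (expected, len(payload)))
    return struct.unpack(">%dQ" % SAMPLES_PER_PACKET, payload)


def receive_packets(bind_ip, port, count, timeout=5.0):
    """Receive `count` packets; socket.timeout propagates if none arrive."""
    packets = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((bind_ip, port))
        sock.settimeout(timeout)
        for _ in range(count):
            payload, sender = sock.recvfrom(9000)
            packets.append((sender, unpack_samples(payload)))
    return packets
```

It would be called as receive_packets("192.168.41.1", dest_port, 10), where dest_port is whatever destination port the design writes into its packets. A timeout here, combined with nothing in Wireshark, points at the link or firmware rather than the script.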
> >>>
> >>> Something that I've noticed, is that I am unable to ping the gbe port 
> >>> (after writing the design to the board). When I look at ifconfig after 
> >>> ssh-ing into the ROACH, I see the following:
> >>>
> >>> gbe 0   Link encap: UNSPEC  HWaddr 
> >>> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> >>>             inet addr:192.168.41.20 P-t-P:192.168.41.20 Mask:255.255.255.0
> >>>             UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
> >>>             RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >>>             TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >>>             collisions:0 txqueuelen: 500
> >>>             RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
> >>>
> >>> I think this looks reasonable other than the lack of packets being sent. 
> >>> I'm now off playing with routing tables thinking that this may be my 
> >>> issue, but I am honestly pretty in the weeds at this point. Do you guys 
> >>> have some suggestions? I'm sure there are a few things I've fouled up 
> >>> along the way.
> >>>
> >>> Thanks for the help,
> >>> Ben G.
> >>>
> >>> --
> >>> You received this message because you are subscribed to the Google Groups 
> >>> "[email protected] <mailto:[email protected]>" group.
> >>> To unsubscribe from this group and stop receiving emails from it, send an 
> >>> email to [email protected] 
> >>> <mailto:casper%[email protected]>.
> >>> To view this discussion on the web visit 
> >>> https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/4634df92-fb67-4bf5-bcab-22478a4c952cn%40lists.berkeley.edu
> >>>  
> >>> <https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/4634df92-fb67-4bf5-bcab-22478a4c952cn%40lists.berkeley.edu>.
> >>
> >
> 
> 
> 
> -- 
> https://katfs.kat.ac.za/~marc/
> 

