Thanks for sharing, Ben!
It's not easy to acknowledge solving "silly" problems of one's own making, but
it is very valuable/helpful to have these explained on the mailing list.
Things like this are far more common than one would perhaps care to admit,
especially among more experienced CASPER folks :P, so it's nice to have a
reminder in the archives for future victims of such self-induced silliness.
FWIW, my Achilles heel is usually forgetting to increase the MTU setting on the
switch ports and/or the network interfaces of the receiving computer.
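
For the archives, here is a quick way to sanity-check this from Python. It is only a sketch: it assumes plain IPv4 without options (20-byte header) plus the 8-byte UDP header, it assumes a Linux receiver that exposes the MTU under /sys/class/net, and the helper names are made up for illustration.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
UDP_HEADER_BYTES = 8    # fixed-size UDP header


def fits_in_mtu(payload_bytes, mtu=1500):
    """Return True if a UDP payload fits in one unfragmented IPv4 packet."""
    return payload_bytes + UDP_HEADER_BYTES + IPV4_HEADER_BYTES <= mtu


def read_mtu(interface):
    """Read the current MTU of a Linux network interface from sysfs."""
    with open(f"/sys/class/net/{interface}/mtu") as f:
        return int(f.read().strip())


if __name__ == "__main__":
    # 64 samples of 64 bits each is a 512-byte payload.
    print(fits_in_mtu(512))
    # An 8192-byte payload needs jumbo frames.
    print(fits_in_mtu(8192))
    print(fits_in_mtu(8192, mtu=9000))
```

On the receiving PC, something like fits_in_mtu(8192, read_mtu("eth1")) would do the check against the live setting, where "eth1" is a placeholder interface name. The switch ports still have to be checked separately.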
It might be useful and maybe even fun(ny?) to start a poll on the CASPER wiki
to find out which is the most common and/or silliest silly problem. In the end
I find these types of problems have equal parts frustration ("why isn't this
working?!"), embarrassment/astonishment ("I can't believe I did/forgot
that!?"), and relief/joy ("yay, it's working now!!"), usually in that order. :)
Cheers,
Dave
> On Oct 22, 2020, at 17:06, 'Benjamin Godfrey' via [email protected]
> <[email protected]> wrote:
>
> Hi Marc and Jack,
> Thank you again for all the suggestions. After fiddling for a while, the
> answer ended up being sillier than I expected: I had selected the wrong slot
> number on the ten gbe yellow block, and I was also using the wrong SFP+ port
> on the PC side of things. No doubt there will be other issues that come up,
> but I can at least digitize mock data using the ROACH2 and read it using a
> Python script now.
>
> - Ben G.
>
> On Tue, Oct 20, 2020 at 3:02 AM Marc <[email protected]> wrote:
> Hi
>
> Hmm... if you are capable of pinging things in one direction, then
> tcpborphserver is at least partially up - amongst other things, it is
> responsible for picking up frames from the fpga and handing them
> off to the kernel, which then does the IP logic and vice versa.
>
> You seem to have problems with arp, and say that you have
> prepopulated the arp tables on the roach with set arp - maybe
> you will have to do the same on the PC side. Linux at least
> has an "arp -s" command to hardcode them into the PC arp
> cache (cat /proc/net/arp).
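
To check from a script whether a static entry actually landed in the PC's ARP cache, the /proc/net/arp text can be parsed directly. This is a minimal sketch: the sample table, the interface name "eth1" and the second row are invented for illustration, and the function name is hypothetical. The flag bits are the standard Linux ones (0x02 complete, 0x04 permanent).

```python
ATF_COM = 0x02   # entry is complete (hardware address known)
ATF_PERM = 0x04  # entry is permanent, e.g. added with "arp -s"


def parse_arp_table(text):
    """Parse the contents of /proc/net/arp into a list of dicts."""
    entries = []
    for line in text.strip().splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 6:
            continue
        ip, _hw_type, flags, mac, _mask, device = fields[:6]
        flags = int(flags, 16)
        entries.append({
            "ip": ip,
            "mac": mac,
            "device": device,
            "complete": bool(flags & ATF_COM),
            "static": bool(flags & ATF_PERM),
        })
    return entries


# Hypothetical sample; on a real machine use open("/proc/net/arp").read().
SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.168.41.20    0x1         0x6         02:02:00:00:00:14     *        eth1
192.168.41.30    0x1         0x0         00:00:00:00:00:00     *        eth1
"""

if __name__ == "__main__":
    for entry in parse_arp_table(SAMPLE):
        print(entry)
```

An entry with "complete" False is one the kernel has not managed to resolve, which is the symptom to look for when pings only work in one direction.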
>
> Note that the roaches do arp in an unusual way - they iterate over
> the subnet (fixed size) and query the hardware addresses
> periodically and pre-emptively, unlike normal arp which only does
> that on demand. This is needed as the ppc/tcpborphserver
> might have no idea which stations the fpga is trying
> to reach. So if you run tcpdump on a PC, you should
> see these queries all the time, if the tap device is up.
>
> There are commands like ?tap-info and ?tap-arp-reload
> which might give you more detail: either on the roach,
> type "kcpcmd tap-info", or remotely
>
> echo "?tap-info" | nc -q 2 -w 2 ip-of-roach 7147
>
> Note that you will have to use those commands, rather
> than looking in /proc/net/arp on the roach, as arp
> isn't handled by the ppc linux kernel - those tables
> have to be shared with the fpga.
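
The remote "nc" query above can also be issued from Python with a plain TCP socket. This is a rough sketch rather than a replacement for a proper KATCP client: it assumes the server answers a request "?name" with zero or more "#..." inform lines followed by one "!name ..." reply line, and the function names are hypothetical. Port 7147 is the one from the nc command.

```python
import socket


def build_request(name):
    """Format a KATCP request line without arguments, e.g. b'?tap-info\\n'."""
    return ("?" + name + "\n").encode("ascii")


def katcp_query(host, name, port=7147, timeout=2.0):
    """Send one request and collect lines until the matching '!name' reply.

    Any inform lines the server sends on connect are collected as well.
    Raises socket.timeout if the reply does not arrive in time.
    """
    lines = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(name))
        with sock.makefile("r", encoding="ascii", errors="replace") as stream:
            for line in stream:
                lines.append(line.rstrip("\n"))
                if line.startswith("!" + name):
                    break
    return lines
```

Usage would be along the lines of katcp_query("ip-of-roach", "tap-info"), printing each returned line.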
>
> regards
>
> marc
>
> On Tue, Oct 20, 2020 at 8:42 AM 'Benjamin Godfrey' via
> [email protected] <[email protected]> wrote:
> >
> > Hi Jack,
> > Thank you for all your suggestions. Really appreciate all the
> > troubleshooting help. Going through your suggestions in order:
> >
> > - EOF is going low with the final valid signal in simulation
> > - But valid is always high when I read the snapshot block, which is
> > unexpected (need to dig further to figure out why this is happening). EOF,
> > though, is still going high for one clock cycle at the expected time.
> > - Reading from the transmit full output reports false, but I don't really
> > understand this since the valid signal is always high.
> >
> > I was having issues with the tap interface populating the ARP table with
> > correct addresses so I've now taken to populating it manually (using
> > set_arp_table, which I found in the docs). Furthermore, I've had problems
> > being able to ping the ROACH from the PC. I am now able to ping the PC
> > while logged into the ROACH, but I am unable to ping the ROACH from the PC side.
> > Do you know why this may be the case?
> >
> > I definitely have some paths to explore.
> >
> > Thanks,
> > Ben G.
> >
> > On Tue, Oct 20, 2020 at 12:56 AM Jack Hickish <[email protected]> wrote:
> >>
> >> Hi Ben,
> >>
> >> Before getting too far into the PowerPC software side, some basic checks
> >> in firmware which are probably worth doing -
> >>
> >> - does EOF go high with (not after) the last valid sample?
> >> - can you (using a snapshot block) verify that what is happening in
> >> firmware with the vld / EOF signals matches your simulation?
> >> - do you have the ability to read the Tge overflow outputs, which are a
> >> good indicator of something going awry?
> >>
> >> If you compile with the "enable core on startup" option on the Tge block
> >> checked, you should be able to transmit regardless of the software "tap"
> >> interface. The tap interface is needed to handle things like ARP, but even
> >> without it you should see packets coming out of your board on tcpdump,
> >> even if they all have the wrong destination MAC addresses.
> >>
> >> Good luck!
> >>
> >> Jack
> >>
> >>
> >> On Mon, 19 Oct 2020, 6:48 pm 'Benjamin Godfrey' via
> >> [email protected], <[email protected]> wrote:
> >>>
> >>> Hi everyone,
> >>> I've been getting my feet wet the last little while, familiarizing
> >>> myself with the ROACH2 toolset. After being unable to get Tutorial 2
> >>> to work out of the box, I took a step back and am trying to just transmit
> >>> packets using the SFP+ port to my PC and read them out. My design is
> >>> heavily based on the one in the ROACH2 tutorial except that I am using
> >>> the katadc to generate data. What I've tried so far is detailed below:
> >>>
> >>> Right now, I am trying to send 64, 64-bit samples at ~390 kHz (feeding in
> >>> an 800 MHz clock) so I shouldn't be overfilling buffers. In simulation,
> >>> it looks like tx_valid and tx_end_of_frame are both being set as I would
> >>> expect, namely tx_valid goes high whenever I am sending data and
> >>> tx_end_of_frame goes high for one clock cycle at the end of when I expect
> >>> to send data.
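
The "shouldn't be overfilling buffers" estimate can be written out as a short calculation. The thread does not say whether ~390 kHz is the packet rate or the rate of individual 64-bit samples, so this sketch works out both readings. It counts payload bits only and ignores Ethernet/IP/UDP header overhead.

```python
SAMPLES_PER_PACKET = 64
BITS_PER_SAMPLE = 64
RATE_HZ = 390e3        # "~390 kHz" from the description above
LINE_RATE_BPS = 10e9   # nominal 10GbE line rate

payload_bits = SAMPLES_PER_PACKET * BITS_PER_SAMPLE  # 4096 bits
payload_bytes = payload_bits // 8                    # 512 bytes per packet

# Reading 1 (assumption): ~390 kHz is the packet rate.
bps_if_packet_rate = payload_bits * RATE_HZ

# Reading 2 (assumption): ~390 kHz is the rate of 64-bit samples.
bps_if_sample_rate = BITS_PER_SAMPLE * RATE_HZ

for label, bps in [("packet rate", bps_if_packet_rate),
                   ("sample rate", bps_if_sample_rate)]:
    print(f"{label}: {bps / 1e9:.3f} Gb/s, "
          f"{100 * bps / LINE_RATE_BPS:.1f}% of line rate")
```

Under either reading the payload rate stays well below 10 Gb/s, which supports the expectation that the transmit buffers are not the bottleneck.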
> >>>
> >>> On the PC side of things, I also wasn't able to get the Python script to
> >>> work out of the box, so I modified it a little bit using suggestions from
> >>> the mail archives. The important details are below:
> >>>
> >>> ip_base = 192*(2**24) + 168*(2**16) + 41*(2**8)
> >>> mac_base = (2<<40) + (2<<32)
> >>> fabric_port = 60000
> >>> gbe_tx = casperfpga.tengbe.TenGbe(fpga, 'gbe0', ip_base+20, 512)
> >>> gbe_tx.setup(mac_base+20, ip_base+20, fabric_port)
> >>> gbe_tx.tap_start()
> >>>
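
To see what the integer ip_base and mac_base values above actually encode, they can be converted back to the usual notation. This sketch uses only the Python standard library; int_to_mac is a helper written for this illustration, not part of casperfpga.

```python
import ipaddress

# Same arithmetic as in the snippet above.
ip_base = 192 * (2**24) + 168 * (2**16) + 41 * (2**8)
mac_base = (2 << 40) + (2 << 32)


def int_to_mac(value):
    """Format a 48-bit integer as a colon-separated MAC address."""
    raw = value.to_bytes(6, "big")
    return ":".join(f"{byte:02x}" for byte in raw)


print(ipaddress.IPv4Address(ip_base + 20))  # 192.168.41.20
print(int_to_mac(mac_base + 20))            # 02:02:00:00:00:14
```

The IP matches the "inet addr" shown by ifconfig on the ROACH, and the leading 0x02 octet of the MAC marks it as a locally administered address.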
> >>> I am trying to write data to 192.168.41.1 (same subnet as the ROACH2), on
> >>> my PC, connected to the ROACH2 using an SFP+ cable. To do this, I've used
> >>> the socket library in Python configured to listen for UDP packets at the
> >>> required IP/port. However, I am unable to get any data from the ROACH2
> >>> whatsoever. I've looked to see if I have any packets coming over the port
> >>> using Wireshark and I see nothing.
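
For reference, a minimal UDP receiver of the kind described above could look like the following. It is a sketch under assumptions: the function names are hypothetical, and the destination port is whatever the design is configured to send to (the thread only gives fabric_port = 60000 for the ROACH side).

```python
import socket


def open_receiver(bind_ip, port, timeout=2.0):
    """Open a UDP socket bound to the given local IP and port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_ip, port))
    sock.settimeout(timeout)
    return sock


def receive_one(sock, max_bytes=9000):
    """Return (data, sender_address) for one packet, or None on timeout."""
    try:
        return sock.recvfrom(max_bytes)
    except socket.timeout:
        return None
```

On the PC this would be used as rx = open_receiver("192.168.41.1", port) followed by receive_one(rx) in a loop. Binding fails if 192.168.41.1 is not assigned to a local interface, which is itself a useful first check. A None result together with an empty Wireshark capture points at the link or firmware side rather than the script.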
> >>>
> >>> Something that I've noticed is that I am unable to ping the gbe port
> >>> (after writing the design to the board). When I look at ifconfig after
> >>> ssh-ing into the ROACH, I see the following:
> >>>
> >>> gbe 0 Link encap: UNSPEC HWaddr
> >>> 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
> >>> inet addr:192.168.41.20 P-t-P:192.168.41.20 Mask:255.255.255.0
> >>> UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
> >>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >>> collisions:0 txqueuelen: 500
> >>> RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
> >>>
> >>> I think this looks reasonable other than the lack of packets being sent.
> >>> I'm now off playing with routing tables thinking that this may be my
> >>> issue, but I am honestly pretty in the weeds at this point. Do you guys
> >>> have some suggestions? I'm sure there are a few things I've fouled up
> >>> along the way.
> >>>
> >>> Thanks for the help,
> >>> Ben G.
> >>>
> >>> --
> >>> You received this message because you are subscribed to the Google Groups
> >>> "[email protected] <mailto:[email protected]>" group.
> >>> To unsubscribe from this group and stop receiving emails from it, send an
> >>> email to [email protected]
> >>> <mailto:casper%[email protected]>.
> >>> To view this discussion on the web visit
> >>> https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/4634df92-fb67-4bf5-bcab-22478a4c952cn%40lists.berkeley.edu
> >>>
> >>> <https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/4634df92-fb67-4bf5-bcab-22478a4c952cn%40lists.berkeley.edu>.
> >>
> >
>
>
>
> --
> https://katfs.kat.ac.za/~marc/