Re: [casper] Roach1 Host name lookup error.
Hi Dave,

Here is the configuration of the network. The host computer, a ROACH1 and a working ROACH2 running in soloboot are the only ones connected. The host is 192.168.40.1 and is offering 192.168.100.50 to the ROACH1, and the ROACH2 is assigned to 192.168.40.50. Is there anything weird about how the ROACH1 handles larger subnets like the one below? Or maybe infinite address leases?

    eth1      Link encap:Ethernet  HWaddr 00:08:54:54:d3:f5
              inet addr:192.168.40.1  Bcast:192.168.255.255  Mask:255.255.0.0
              inet6 addr: fe80::208:54ff:fe54:d3f5/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:705 errors:0 dropped:0 overruns:0 frame:0
              TX packets:1377 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:210574 (210.5 KB)  TX bytes:302407 (302.4 KB)
              Interrupt:20 Base address:0x6000

Here is the tcpdump:

    tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
    19:46:49.968388 02:00:00:03:01:91 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 343: (tos 0x0, ttl 255, id 225, offset 0, flags [DF], proto UDP (17), length 329)
        0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:00:00:03:01:91, length 301, xid 0x144252, secs 1130, Flags [none]
          Client-Ethernet-Address 02:00:00:03:01:91
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            MSZ Option 57, length 2: 576
            Parameter-Request Option 55, length 5:
              Subnet-Mask, Default-Gateway, Hostname, BS
              RP
    19:46:49.968953 00:08:54:54:d3:f5 > 02:00:00:03:01:91, ethertype IPv4 (0x0800), length 349: (tos 0x0, ttl 64, id 33388, offset 0, flags [none], proto UDP (17), length 335)
        192.168.40.1.67 > 192.168.100.50.68: BOOTP/DHCP, Reply, length 307, xid 0x144252, secs 1130, Flags [none]
          Your-IP 192.168.100.50
          Server-IP 192.168.40.1
          Client-Ethernet-Address 02:00:00:03:01:91
          file uImage
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Offer
            Server-ID Option 54, length 4: 192.168.40.1
            Lease-Time Option 51, length 4: 4294967295
            Subnet-Mask Option 1, length 4: 255.255.0.0
            Hostname Option 12, length 8: roach1-4
            RP Option 17, length 33: 192.168.40.1:/srv/roach_boot/etch
    19:46:52.971116 02:00:00:03:01:91 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 343: (tos 0x0, ttl 255, id 226, offset 0, flags [DF], proto UDP (17), length 329)
        0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:00:00:03:01:91, length 301, xid 0x144e0d, secs 1133, Flags [none]
          Client-Ethernet-Address 02:00:00:03:01:91
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            MSZ Option 57, length 2: 576
            Parameter-Request Option 55, length 5:
              Subnet-Mask, Default-Gateway, Hostname, BS
              RP
    19:46:52.971744 00:08:54:54:d3:f5 > 02:00:00:03:01:91, ethertype IPv4 (0x0800), length 349: (tos 0x0, ttl 64, id 34083, offset 0, flags [none], proto UDP (17), length 335)
        192.168.40.1.67 > 192.168.100.50.68: BOOTP/DHCP, Reply, length 307, xid 0x144e0d, secs 1133, Flags [none]
          Your-IP 192.168.100.50
          Server-IP 192.168.40.1
          Client-Ethernet-Address 02:00:00:03:01:91
          file uImage
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Offer
            Server-ID Option 54, length 4: 192.168.40.1
            Lease-Time Option 51, length 4: 4294967295
            Subnet-Mask Option 1, length 4: 255.255.0.0
            Hostname Option 12, length 8: roach1-4
            RP Option 17, length 33: 192.168.40.1:/srv/roach_boot/etch

Brad Dober
Ph.D. Candidate
Department of Physics and Astronomy
University of Pennsylvania
Cell: 262-949-4668

On Tue, May 26, 2015 at 7:42 PM, David MacMahon dav...@astro.berkeley.edu wrote:

Weird. Are there any other hosts on the network that might also be sending (non-netboot-aware) DHCP offers? What does

    sudo tcpdump -i eth0 -n -e -v port bootps or port bootpc

show (replacing eth0 with the actual network interface name where the DHCP activity is)?

Dave

On May 26, 2015, at 4:34 PM, Brad Dober wrote:

Hi Dave,

I switched to a 100 Mbps switch, and I'm still getting the ROACH1 continuously sending DHCP discovers and my host computer continuously sending offers, but now the occasional request/acknowledge and uboot download is no longer happening. For what it's worth, I am not using jumbo frames.

Brad Dober
Ph.D. Candidate
Department of Physics and Astronomy
University of Pennsylvania
Cell: 262-949-4668

On Tue, May 26, 2015 at 7:21 PM, David MacMahon dav...@astro.berkeley.edu wrote:

Are you trying to run the ROACH1 on 1 GbE? ROACH1 is not reliable on 1 GbE. You have to force it to 100 Mbps. This can be done by using an unmanaged non-gigabit switch (or hub), or a managed switch that can force its port for the ROACH1 to 100 Mbps only. For direct connect, you'll have to use mii-tool on
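As a quick sanity check on the addressing quoted in this message, the following minimal Python sketch confirms that the offered address lies inside the host's /16 subnet and that the lease time in the DHCP offer is the all-ones value that DHCP uses for an infinite lease. It only checks that the numbers are self-consistent; it says nothing about how the ROACH1 u-boot actually handles either.

```python
import ipaddress

# Values copied from the ifconfig output and the DHCP offer in this thread.
host = ipaddress.ip_interface("192.168.40.1/255.255.0.0")
offered = ipaddress.ip_address("192.168.100.50")
lease_seconds = 4294967295

network = host.network
print("network:", network)
print("broadcast:", network.broadcast_address)
print("offer inside host subnet:", offered in network)
# A lease time of 0xFFFFFFFF is the DHCP encoding for "infinite".
print("lease is infinite:", lease_seconds == 0xFFFFFFFF)
```

If the offered address had fallen outside the host's network, the third line would print False, which would be one thing worth ruling out before looking at the client.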
Re: [casper] Roach1 Host name lookup error.
Nothing obvious comes to mind (yet). Can you watch the roach1 boot process via serial console? What does that show? Are you using dnsmasq for the DHCP server? What if you try direct connect with mii-tool to set the speed of eth1 to 100 Mbps?

Dave

On May 26, 2015, at 5:00 PM, Brad Dober wrote:

Hi Dave,

Here is the configuration of the network. The host computer, a ROACH1 and a working ROACH2 running in soloboot are the only ones connected. The host is 192.168.40.1 and is offering 192.168.100.50 to the ROACH1, and the ROACH2 is assigned to 192.168.40.50. Is there anything weird about how the ROACH1 handles larger subnets like the one below? Or maybe infinite address leases?

    eth1      Link encap:Ethernet  HWaddr 00:08:54:54:d3:f5
              inet addr:192.168.40.1  Bcast:192.168.255.255  Mask:255.255.0.0
              inet6 addr: fe80::208:54ff:fe54:d3f5/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:705 errors:0 dropped:0 overruns:0 frame:0
              TX packets:1377 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:210574 (210.5 KB)  TX bytes:302407 (302.4 KB)
              Interrupt:20 Base address:0x6000

Here is the tcpdump:

    tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
    19:46:49.968388 02:00:00:03:01:91 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 343: (tos 0x0, ttl 255, id 225, offset 0, flags [DF], proto UDP (17), length 329)
        0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:00:00:03:01:91, length 301, xid 0x144252, secs 1130, Flags [none]
          Client-Ethernet-Address 02:00:00:03:01:91
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            MSZ Option 57, length 2: 576
            Parameter-Request Option 55, length 5:
              Subnet-Mask, Default-Gateway, Hostname, BS
              RP
    19:46:49.968953 00:08:54:54:d3:f5 > 02:00:00:03:01:91, ethertype IPv4 (0x0800), length 349: (tos 0x0, ttl 64, id 33388, offset 0, flags [none], proto UDP (17), length 335)
        192.168.40.1.67 > 192.168.100.50.68: BOOTP/DHCP, Reply, length 307, xid 0x144252, secs 1130, Flags [none]
          Your-IP 192.168.100.50
          Server-IP 192.168.40.1
          Client-Ethernet-Address 02:00:00:03:01:91
          file uImage
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Offer
            Server-ID Option 54, length 4: 192.168.40.1
            Lease-Time Option 51, length 4: 4294967295
            Subnet-Mask Option 1, length 4: 255.255.0.0
            Hostname Option 12, length 8: roach1-4
            RP Option 17, length 33: 192.168.40.1:/srv/roach_boot/etch
    19:46:52.971116 02:00:00:03:01:91 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 343: (tos 0x0, ttl 255, id 226, offset 0, flags [DF], proto UDP (17), length 329)
        0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 02:00:00:03:01:91, length 301, xid 0x144e0d, secs 1133, Flags [none]
          Client-Ethernet-Address 02:00:00:03:01:91
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            MSZ Option 57, length 2: 576
            Parameter-Request Option 55, length 5:
              Subnet-Mask, Default-Gateway, Hostname, BS
              RP
    19:46:52.971744 00:08:54:54:d3:f5 > 02:00:00:03:01:91, ethertype IPv4 (0x0800), length 349: (tos 0x0, ttl 64, id 34083, offset 0, flags [none], proto UDP (17), length 335)
        192.168.40.1.67 > 192.168.100.50.68: BOOTP/DHCP, Reply, length 307, xid 0x144e0d, secs 1133, Flags [none]
          Your-IP 192.168.100.50
          Server-IP 192.168.40.1
          Client-Ethernet-Address 02:00:00:03:01:91
          file uImage
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Offer
            Server-ID Option 54, length 4: 192.168.40.1
            Lease-Time Option 51, length 4: 4294967295
            Subnet-Mask Option 1, length 4: 255.255.0.0
            Hostname Option 12, length 8: roach1-4
            RP Option 17, length 33: 192.168.40.1:/srv/roach_boot/etch

Brad Dober
Ph.D. Candidate
Department of Physics and Astronomy
University of Pennsylvania
Cell: 262-949-4668

On Tue, May 26, 2015 at 7:42 PM, David MacMahon dav...@astro.berkeley.edu wrote:

Weird. Are there any other hosts on the network that might also be sending (non-netboot-aware) DHCP offers? What does

    sudo tcpdump -i eth0 -n -e -v port bootps or port bootpc

show (replacing eth0 with the actual network interface name where the DHCP activity is)?

Dave

On May 26, 2015, at 4:34 PM, Brad Dober wrote:

Hi Dave,

I switched to a 100 Mbps switch, and I'm still getting the ROACH1 continuously sending DHCP discovers and my host computer continuously sending offers,
Re: [casper] casperfpga on a ROACH 1 - problem with get_system_information
Hi all,

For the information of anyone else who might encounter a problem similar to mine: the NotImplementedError means that the device just wasn't implemented in casperfpga yet. In this case it didn't really need to be, because the katadc is pretty simple compared with the newer meerkatADC (from what I understood). The error caused the get_system_information() function to bomb and not finish processing the snapshot blocks, which is why they weren't working.

I've made a small change to the code and submitted a pull request, so it should be fixed for katadc in the future, but if anyone else comes across similar problems then that might be a good place to start looking for a solution.

Regards,
James

On Tue, May 19, 2015 at 12:49 PM, James Smith jsm...@ska.ac.za wrote:

Hello again, Casperites,

Further tracking of my challenge: I removed the KatADC from the model and the problem went away. I'm not sure why that NotImplementedError is raised.

However, even with the KatADC gone, and after loading the system information, I tried to read from a snapshot block and this is what I got:

    In [23]: fpga.snapshots.led_snap_ss.read()
    ---------------------------------------------------------------------------
    KatcpRequestError                         Traceback (most recent call last)
    <ipython-input-23-4835bbdb7b15> in <module>()
    ----> 1 fpga.snapshots.led_snap_ss.read()

    /usr/local/lib/python2.7/dist-packages/casperfpga/snap.pyc in read(self, **kwargs)
        188             if kkey not in ['circular_capture', 'man_trig', 'man_valid', 'offset', 'timeout', 'arm']:
        189                 raise RuntimeError('Invalid option for snap read: %s' % kkey)
    --> 190         rawdata, rawtime = self.read_raw(**kwargs)
        191         # processed = self._process_data_no_construct(rawdata['data'])
        192         processed = self._process_data(rawdata['data'])

    /usr/local/lib/python2.7/dist-packages/casperfpga/snap.pyc in read_raw(self, **kwargs)
        212         arm = getkwarg('arm', True)
        213         if arm:
    --> 214             self.arm(man_trig=man_trig, man_valid=man_valid, offset=offset, circular_capture=circular_capture)
        215         # wait
        216         done = False

    /usr/local/lib/python2.7/dist-packages/casperfpga/snap.pyc in arm(self, man_trig, man_valid, offset, circular_capture)
        150                 (man_trig << 1) +
        151                 (man_valid << 2) +
    --> 152                 (circular_capture << 3)))
        153         self.control_registers['control']['register'].write_int((1 +
        154                 (man_trig << 1) +

    /usr/local/lib/python2.7/dist-packages/casperfpga/register.pyc in write_int(self, uintvalue, blindwrite, word_offset)
         87         Write an unsigned integer to this device using the fpga client.
         88
    ---> 89         self.parent.write_int(device_name=self.name, integer=uintvalue, blindwrite=blindwrite, word_offset=word_offset)
         90
         91     def _write_common(self, **kwargs):

    /usr/local/lib/python2.7/dist-packages/casperfpga/casperfpga.pyc in write_int(self, device_name, integer, blindwrite, word_offset)
        243             self.blindwrite(device_name, data, word_offset*4)
        244         else:
    --> 245             self.write(device_name, data, word_offset*4)
        246         LOGGER.debug('write_int %8x to register %s at word offset %d okay%s.'
        247                      % (integer, device_name, word_offset, ' (blind)' if blindwrite else ''))

    /usr/local/lib/python2.7/dist-packages/casperfpga/casperfpga.pyc in write(self, device_name, data, offset)
        193         :return:
        194
    --> 195         self.blindwrite(device_name, data, offset)
        196         new_data = self.read(device_name, len(data), offset)
        197         if new_data != data:

    /usr/local/lib/python2.7/dist-packages/casperfpga/katcp_fpga.pyc in blindwrite(self, device_name, data, offset)
        199         assert((offset % 4) == 0), 'You must write 32-bit-bounded words!'
        200         self.katcprequest(name='write', request_timeout=self._timeout, require_ok=True,
    --> 201                           request_args=(device_name, str(offset), data))
        202
        203     def bulkread(self, device_name, size, offset=0):

    /usr/local/lib/python2.7/dist-packages/casperfpga/katcp_fpga.pyc in katcprequest(self, name, request_timeout, require_ok, request_args)
        116             raise KatcpRequestError(
        117                 'Request %s on host %s failed.\n\tRequest: %s\n\tReply: %s' %
    --> 118                 (request.name, self.host, request, reply))
        119         return reply, informs
        120

    KatcpRequestError: Request write on host localhost failed.
            Request: ?write led_snap_ss_ctrl 0 \0\0\0\0
            Reply: !write fail register

Reading and writing other registers works just fine. Telnetting into the ROACH and trying to write the snap_ss_ctrl register that way also fails, so it's quite likely that the problem is not with the
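The failure mode described at the top of this message (one unimplemented device type aborting the whole system-information pass, so that later snapshot blocks are never set up) can be illustrated with a small generic sketch. This is not casperfpga's code, nor the change in the pull request mentioned above; `load_devices`, the factory functions and the device names are hypothetical and exist only for illustration.

```python
import logging

LOGGER = logging.getLogger(__name__)


def load_devices(device_specs, factories):
    """Build every device, skipping types whose factory is not implemented.

    device_specs maps a device name to its type string; factories maps a
    type string to a callable that builds the device object.
    Returns (loaded, skipped).
    """
    loaded, skipped = {}, []
    for name, dev_type in device_specs.items():
        try:
            loaded[name] = factories[dev_type](name)
        except NotImplementedError:
            # Do not let one unsupported device stop the remaining ones.
            LOGGER.warning("device %s (%s) not implemented, skipping", name, dev_type)
            skipped.append(name)
    return loaded, skipped


def make_unimplemented(name):
    # Hypothetical stand-in for a device class that has not been written yet.
    raise NotImplementedError("no support for %s" % name)


def make_snapshot(name):
    # Hypothetical stand-in for a working device class.
    return {"type": "snapshot", "name": name}


specs = {"katadc0": "katadc", "led_snap_ss": "snapshot"}
factories = {"katadc": make_unimplemented, "snapshot": make_snapshot}
loaded, skipped = load_devices(specs, factories)
print(sorted(loaded), skipped)
```

Without the try/except, the loop would stop at the first unsupported device and `led_snap_ss` would never be created, which matches the symptom reported here.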
Re: [casper] Roach1 Host name lookup error.
I've switched to NFS boot to avoid SD card corruptions. However, when attempting to run netboot, the roach will send a DHCP discover, the host will offer an address, and then the roach will send a discover again. This goes on 10-15 times until finally the roach requests the correct IP and the host acknowledges. The roach then begins the TFTP download of the uboot image, but requests block 1 multiple times, is sent it, acknowledges once, starts being sent blocks 1 and 2, and then restarts the whole process by asking for an IP address again.

The whole process seems very strange and I'm having trouble wrapping my head around what could be causing it. Has anyone encountered something similar?

Brad Dober
Ph.D. Candidate
Department of Physics and Astronomy
University of Pennsylvania
Cell: 262-949-4668

On Thu, May 21, 2015 at 3:32 AM, Marc Welz m...@ska.ac.za wrote:

On Wed, May 20, 2015 at 5:19 PM, Brad Dober do...@sas.upenn.edu wrote:

Hi Casperites,

I have a Roach1 which is booting from an SD card. I booted it up yesterday, and it was displaying the STALE NFS handle error that other people have seen in the past (which suggested a corrupt flash card). I ran fsck and fixed several errors, and when rebooting, the stale NFS handle error went away. However, now the ROACH could not connect to the network.

When I run

    ifconfig 128.91.46.20 netmask 255.255.248.0 gateway 128.91.4

I get:

    gateway: Host name lookup failure

    root@(none):~# hostname -v
    (none)

If you require a hostname, put it in /etc/hostname or similar and then run

    hostname -F /etc/hostname

regards
marc
Re: [casper] Roach1 Host name lookup error.
Are you trying to run the ROACH1 on 1 GbE? ROACH1 is not reliable on 1 GbE. You have to force it to 100 Mbps. This can be done by using an unmanaged non-gigabit switch (or hub), or a managed switch that can force its port for the ROACH1 to 100 Mbps only. For direct connect, you'll have to use mii-tool on the server.

Another thing that always gets me is MTU. I don't think the ROACH1 u-boot supports jumbo frames, so you'll have to run the server with MTU==1500 to netboot ROACH1.

Dave

On May 26, 2015, at 3:51 PM, Brad Dober wrote:

I've switched to NFS boot to avoid SD card corruptions. However, when attempting to run netboot, the roach will send a DHCP discover, the host will offer an address, and then the roach will send a discover again. This goes on 10-15 times until finally the roach requests the correct IP and the host acknowledges. The roach then begins the TFTP download of the uboot image, but requests block 1 multiple times, is sent it, acknowledges once, starts being sent blocks 1 and 2, and then restarts the whole process by asking for an IP address again.

The whole process seems very strange and I'm having trouble wrapping my head around what could be causing it. Has anyone encountered something similar?

Brad Dober
Ph.D. Candidate
Department of Physics and Astronomy
University of Pennsylvania
Cell: 262-949-4668

On Thu, May 21, 2015 at 3:32 AM, Marc Welz m...@ska.ac.za wrote:

On Wed, May 20, 2015 at 5:19 PM, Brad Dober do...@sas.upenn.edu wrote:

Hi Casperites,

I have a Roach1 which is booting from an SD card. I booted it up yesterday, and it was displaying the STALE NFS handle error that other people have seen in the past (which suggested a corrupt flash card). I ran fsck and fixed several errors, and when rebooting, the stale NFS handle error went away. However, now the ROACH could not connect to the network.

When I run

    ifconfig 128.91.46.20 netmask 255.255.248.0 gateway 128.91.4

I get:

    gateway: Host name lookup failure

    root@(none):~# hostname -v
    (none)

If you require a hostname, put it in /etc/hostname or similar and then run

    hostname -F /etc/hostname

regards
marc
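The two server-side conditions Dave lists (link forced to 100 Mbps, MTU of 1500) can be checked from Python on a Linux host by reading the interface's sysfs entries. The sketch below is a minimal illustration under that assumption; the function names are made up for this example, and the interface name `eth1` is simply the one that appears elsewhere in this thread.

```python
from pathlib import Path


def read_link_settings(iface, sysfs_root="/sys/class/net"):
    """Return (mtu, speed_mbps) for a Linux network interface.

    speed_mbps is None when the kernel cannot report it (e.g. link down).
    """
    base = Path(sysfs_root) / iface
    mtu = int((base / "mtu").read_text().strip())
    try:
        speed = int((base / "speed").read_text().strip())
    except (OSError, ValueError):
        speed = None
    return mtu, speed


def netboot_warnings(mtu, speed_mbps):
    """List the ways the settings differ from the advice in this thread."""
    warnings = []
    if mtu != 1500:
        warnings.append("MTU is %d; the advice here is to use 1500" % mtu)
    if speed_mbps is None:
        warnings.append("link speed unknown; check it with mii-tool")
    elif speed_mbps > 100:
        warnings.append("link is %d Mbps; force it to 100 Mbps" % speed_mbps)
    return warnings
```

On the boot server this could be run as `netboot_warnings(*read_link_settings("eth1"))`; an empty list means both conditions hold on the server side, although it cannot tell whether the ROACH1 end negotiated the same speed.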
Re: [casper] Roach1 Host name lookup error.
Hi Dave,

I switched to a 100 Mbps switch, and I'm still getting the ROACH1 continuously sending DHCP discovers and my host computer continuously sending offers, but now the occasional request/acknowledge and uboot download is no longer happening. For what it's worth, I am not using jumbo frames.

Brad Dober
Ph.D. Candidate
Department of Physics and Astronomy
University of Pennsylvania
Cell: 262-949-4668

On Tue, May 26, 2015 at 7:21 PM, David MacMahon dav...@astro.berkeley.edu wrote:

Are you trying to run the ROACH1 on 1 GbE? ROACH1 is not reliable on 1 GbE. You have to force it to 100 Mbps. This can be done by using an unmanaged non-gigabit switch (or hub), or a managed switch that can force its port for the ROACH1 to 100 Mbps only. For direct connect, you'll have to use mii-tool on the server.

Another thing that always gets me is MTU. I don't think the ROACH1 u-boot supports jumbo frames, so you'll have to run the server with MTU==1500 to netboot ROACH1.

Dave

On May 26, 2015, at 3:51 PM, Brad Dober wrote:

I've switched to NFS boot to avoid SD card corruptions. However, when attempting to run netboot, the roach will send a DHCP discover, the host will offer an address, and then the roach will send a discover again. This goes on 10-15 times until finally the roach requests the correct IP and the host acknowledges. The roach then begins the TFTP download of the uboot image, but requests block 1 multiple times, is sent it, acknowledges once, starts being sent blocks 1 and 2, and then restarts the whole process by asking for an IP address again.

The whole process seems very strange and I'm having trouble wrapping my head around what could be causing it. Has anyone encountered something similar?

Brad Dober
Ph.D. Candidate
Department of Physics and Astronomy
University of Pennsylvania
Cell: 262-949-4668

On Thu, May 21, 2015 at 3:32 AM, Marc Welz m...@ska.ac.za wrote:

On Wed, May 20, 2015 at 5:19 PM, Brad Dober do...@sas.upenn.edu wrote:

Hi Casperites,

I have a Roach1 which is booting from an SD card. I booted it up yesterday, and it was displaying the STALE NFS handle error that other people have seen in the past (which suggested a corrupt flash card). I ran fsck and fixed several errors, and when rebooting, the stale NFS handle error went away. However, now the ROACH could not connect to the network.

When I run

    ifconfig 128.91.46.20 netmask 255.255.248.0 gateway 128.91.4

I get:

    gateway: Host name lookup failure

    root@(none):~# hostname -v
    (none)

If you require a hostname, put it in /etc/hostname or similar and then run

    hostname -F /etc/hostname

regards
marc
Re: [casper] Roach1 Host name lookup error.
Weird. Are there any other hosts on the network that might also be sending (non-netboot-aware) DHCP offers? What does

    sudo tcpdump -i eth0 -n -e -v port bootps or port bootpc

show (replacing eth0 with the actual network interface name where the DHCP activity is)?

Dave

On May 26, 2015, at 4:34 PM, Brad Dober wrote:

Hi Dave,

I switched to a 100 Mbps switch, and I'm still getting the ROACH1 continuously sending DHCP discovers and my host computer continuously sending offers, but now the occasional request/acknowledge and uboot download is no longer happening. For what it's worth, I am not using jumbo frames.

Brad Dober
Ph.D. Candidate
Department of Physics and Astronomy
University of Pennsylvania
Cell: 262-949-4668

On Tue, May 26, 2015 at 7:21 PM, David MacMahon dav...@astro.berkeley.edu wrote:

Are you trying to run the ROACH1 on 1 GbE? ROACH1 is not reliable on 1 GbE. You have to force it to 100 Mbps. This can be done by using an unmanaged non-gigabit switch (or hub), or a managed switch that can force its port for the ROACH1 to 100 Mbps only. For direct connect, you'll have to use mii-tool on the server.

Another thing that always gets me is MTU. I don't think the ROACH1 u-boot supports jumbo frames, so you'll have to run the server with MTU==1500 to netboot ROACH1.

Dave

On May 26, 2015, at 3:51 PM, Brad Dober wrote:

I've switched to NFS boot to avoid SD card corruptions. However, when attempting to run netboot, the roach will send a DHCP discover, the host will offer an address, and then the roach will send a discover again. This goes on 10-15 times until finally the roach requests the correct IP and the host acknowledges. The roach then begins the TFTP download of the uboot image, but requests block 1 multiple times, is sent it, acknowledges once, starts being sent blocks 1 and 2, and then restarts the whole process by asking for an IP address again.

The whole process seems very strange and I'm having trouble wrapping my head around what could be causing it. Has anyone encountered something similar?

Brad Dober
Ph.D. Candidate
Department of Physics and Astronomy
University of Pennsylvania
Cell: 262-949-4668

On Thu, May 21, 2015 at 3:32 AM, Marc Welz m...@ska.ac.za wrote:

On Wed, May 20, 2015 at 5:19 PM, Brad Dober do...@sas.upenn.edu wrote:

Hi Casperites,

I have a Roach1 which is booting from an SD card. I booted it up yesterday, and it was displaying the STALE NFS handle error that other people have seen in the past (which suggested a corrupt flash card). I ran fsck and fixed several errors, and when rebooting, the stale NFS handle error went away. However, now the ROACH could not connect to the network.

When I run

    ifconfig 128.91.46.20 netmask 255.255.248.0 gateway 128.91.4

I get:

    gateway: Host name lookup failure

    root@(none):~# hostname -v
    (none)

If you require a hostname, put it in /etc/hostname or similar and then run

    hostname -F /etc/hostname

regards
marc
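When the verbose tcpdump output requested above gets long, it can help to tally the DHCP message types rather than read every packet. The sketch below is one hedged way to do that in Python: it assumes the capture was saved as text and that message types are printed on lines of the form `DHCP-Message Option 53, length 1: <type>`, as in the capture posted in this thread. The function names and the "stuck" heuristic are illustrative only.

```python
import re
from collections import Counter

# Matches the message-type line that tcpdump -v prints for each DHCP packet.
_MSG_TYPE = re.compile(r"DHCP-Message Option 53, length 1: (\w+)")


def count_dhcp_messages(capture_text):
    """Count DHCP message types found in saved tcpdump -v text output."""
    return Counter(_MSG_TYPE.findall(capture_text))


def looks_stuck(counts):
    """True when the client keeps discovering but never sends a Request."""
    return counts["Discover"] > 1 and counts["Request"] == 0


# Message-type lines in the order they appear in the capture from this thread.
sample = (
    "DHCP-Message Option 53, length 1: Discover\n"
    "DHCP-Message Option 53, length 1: Offer\n"
    "DHCP-Message Option 53, length 1: Discover\n"
    "DHCP-Message Option 53, length 1: Offer\n"
)
counts = count_dhcp_messages(sample)
print(dict(counts), "stuck:", looks_stuck(counts))
```

A tally with many Discover/Offer pairs and no Request is the pattern reported in this thread: the client sees (or ignores) offers without ever accepting one.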