Re: [casper] Fwd: Re: SPDO ROACH spectrometer
As you say, NFS mounts work correctly, which would indicate that the network is operating as expected.

WRT other errors, are you certain that all reads/writes on the FPGA are on 32-bit boundaries? Byte-sized and 16-bit reads are supposed to work, but we have found that for some reason they sometimes cause crashes. The system doesn't break immediately, but on subsequent bus transactions. These manifest as kernel crashes.

Jason

On 04 Nov 2009, at 07:56, Kjetil Wormnes wrote:

Hi all,

For reference I've attached a summary of our problems below, and a few things I have attempted to do to isolate them. The short of it is that we are unable to transfer large amounts of data across the ethernet reliably, regardless of:

-- kernel version
-- whether we USB-mount or NFS-mount the root file system
-- network protocol used for the transfer

The way the crash happens varies, and is not repeatable. Sometimes it seems to be a userspace crash, sometimes it is a kernel panic. I have been unable to see any real pattern in the crash reports. This to me seems to indicate that the root cause of the problem may be common, and either an obscure kernel problem or possibly something in the interface between the kernel and the hardware or in the hardware itself.

It wouldn't be a big effort to reimplement our software to run on a remote machine and talk to the ROACH over KATCP, rather than run locally on the ppc. But since it would require a complete rewrite of the software, we haven't tested this yet. Perhaps it is worth trying. The catch is that I am still really unsure whether we are dealing with many symptoms of the same problem, or many different problems.

Anyway, I would like to thank you for all your input, and will let you know if and how we find a satisfactory solution.

cheers
Kjetil

Here is the summary:

*The problem*

The system crashes when downloading large files. There appear to be varying causes for this crash that may or may not have a common underlying reason.

I have attempted to isolate the problem by

• Downloading using different protocols and software; ssh and two different ftp servers.
• Mounting the filesystem over NFS as opposed to USB
• Installing well-known and used kernels, and comparing to custom kernels.

*SSH*

SSH always crashes with “Invalid MAC on input” or related error messages. This appears to be a problem with SSH.

*FTP*

System instabilities were observed using two different ftp servers: proftpd and pure-ftpd. In the best case, with pure-ftpd, I was able to download 2-3 files, each about 2GB in size, before the system crashed. The call stack seemed to indicate that the crash happened in EMAC interface functions (i.e. Ethernet). However, we have no way of knowing whether these crashes are in fact side-effects of the USB subsystem misbehaving. Jason from the Casper mailing list has once again reconfirmed that USB on powerpcs is notoriously unreliable.

*DIFFERENT KERNELS - DIFFERENT PROBLEMS*

With some kernels (the latest) the link was unable to come up at all, while with both a custom-compiled older kernel (from a couple of months ago) and a downloaded image, uImage-20091006-mmcfix, the link came up, but with all the crashes described.

*ELIMINATING USB AS A CAUSE*

To eliminate the effects of USB, I mounted the root filesystem remotely using NFS. I make a few observations:

*SSH*

Still dies from time to time with the Invalid MAC error message. This was expected, as we have already pretty much determined that this error is ssh-specific and not related to our other worries.

*ETHERNET*

Comes up nicely. The system mounts remotely and file access has not caused any obvious problems. In fact I have not really had any problems that I can trace directly back to the Ethernet.

That being said, the system seems to crash after a little while with this setup also. The error messages have been varying. Only once has it been a kernel crash, and then, looking at the call stack, it no longer appears to crash inside EMAC access functions.

The download speeds seem quite variable, but this is more likely due to the network (since the operating system is running over NFS) than to the ROACH board itself.

Jason Manley wrote:

Marc Welz or David George built that kernel. They are the best people to ask about this. I've cc'd them, though I'm not sure either would have the config file from that release. It might be easiest to check out an older svn version.

Might I suggest that instead of recording data to a USB HDD, you rather record it across the network to another computer? If you don't want to use KATCP for dumping the data directly from your FPGA, you can always mount an NFS network share on your ROACH and record the data there. The USB on the PPC platforms is notoriously unreliable.

Jason

On 03 Nov 2009, at 03:05, Kjetil Wormnes wrote:

Hi Jason, Thank you again for your
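For anyone wanting to rule out the unaligned-access issue Jason mentions, here is a minimal sketch of one way to force every access onto 32-bit boundaries from user code. It is only an illustration: the names `aligned_span` and `read_aligned` are made up for this example, `dev` is assumed to be any seekable binary file-like object standing in for a register or memory file, and whether each 4-byte read really becomes a single 32-bit bus transaction depends on the driver, which this sketch does not address.

```python
import io

WORD = 4  # bytes in one 32-bit bus word


def aligned_span(offset, length, word=WORD):
    """Return (start, size) of the smallest word-aligned span covering the request."""
    start = (offset // word) * word
    end = -(-(offset + length) // word) * word  # round the end up to a word boundary
    return start, end - start


def read_aligned(dev, offset, length):
    """Read `length` bytes at `offset`, touching the device only in whole aligned words."""
    start, size = aligned_span(offset, length)
    words = []
    for addr in range(start, start + size, WORD):
        dev.seek(addr)
        chunk = dev.read(WORD)
        if len(chunk) != WORD:
            raise IOError("short read at offset %d" % addr)
        words.append(chunk)
    raw = b"".join(words)
    skip = offset - start  # drop the padding bytes in front of the requested range
    return raw[skip:skip + length]


if __name__ == "__main__":
    # An in-memory buffer stands in for the device file in this demonstration.
    fake_dev = io.BytesIO(bytes(range(16)))
    print(read_aligned(fake_dev, 5, 2))
```

The request for 2 bytes at offset 5 is widened to the word at offsets 4 to 7, and the two wanted bytes are sliced out afterwards, so the device never sees a byte-sized or 16-bit access.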
Re: [casper] program ROACH over JTAG
Hello Suraj,

Thanks for your explanation. Maybe I didn't describe it clearly. I mean that I want to program the FPGA on ROACH over JTAG. Also, the bit file is generated by Xilinx ISE, not the CASPER toolflow. At the moment, the ucf file of my design only includes the IO pins of my design; a lot of the pins that ROACH needs are not included. How could I program the FPGA of ROACH with my bit file?

Thanks,
C-H Cheng

Hello,

On Nov 3, 2009, at 6:39 PM, C-H Cheng wrote:

Hello All

I want to simulate a design in ISE and generate a bit file to download to ROACH over JTAG.

You can do this using the .bit generated by the CASPER toolflow, available in the same location as the .bof. bof files are generated from .bit files using the script 'mkbof' distributed in 'XPS_ROACH_BASE'.

A problem I have met is the FPGA pin number assignment. For example, in ISE I select the device vxs95t, and the FPGA pin assignment in the ucf file is according to my design. But ROACH has a lot of FPGA pins which are not in my design but are needed for ROACH: sys_clk_n, sys_clk_p, aux0_clk_p, aux_clk_n, ppc_irq_n, ...etc.

Part of the magic of the toolflow is that it adds the necessary IO pins to the .ucf depending on which IO blocks you have selected to use (10gbe, adc, etc.). It's part of the reason that IO blocks get special designation as yellowblocks, as they are processed differently for each block.

Hence, I can't add the FPGA pins that ROACH needs into the ucf file of my design.

It would definitely be faster to just put the I/O blocks you want to use into a blank model file, and use the CASPER toolflow to generate the .ucf, by un-checking the boxes for 'update system design', 'system generator', and 'ISE/EDK/bitgen' in the 'bee_xps' dialog. This should take about a minute to complete.

-Suraj
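Once a toolflow-generated .ucf is available, a quick check like the following could show which board-level nets a hand-written .ucf is still missing. This is a rough sketch under stated assumptions: the net names are only the ones quoted in this thread (the real list depends on which yellow blocks are used), the helper names `constrained_nets` and `missing_nets` are invented here, and the pin locations in the sample text are placeholders, not real ROACH pins.

```python
import re

# Net names quoted in the thread; an incomplete, illustrative list.
REQUIRED_NETS = ["sys_clk_n", "sys_clk_p", "aux0_clk_p", "aux_clk_n", "ppc_irq_n"]

# Matches lines such as:  NET "sys_clk_n" LOC = "A1";   or   NET sys_clk_p LOC = A2;
NET_RE = re.compile(r'^\s*NET\s+"?([^"\s]+)"?\s', re.IGNORECASE)


def constrained_nets(ucf_text):
    """Return the set of net names that have a NET constraint in the UCF text."""
    nets = set()
    for line in ucf_text.splitlines():
        line = line.split("#", 1)[0]  # drop UCF comments
        match = NET_RE.match(line)
        if match:
            nets.add(match.group(1))
    return nets


def missing_nets(ucf_text, required=REQUIRED_NETS):
    """Return the required nets with no constraint, in the order given."""
    present = constrained_nets(ucf_text)
    return [net for net in required if net not in present]


if __name__ == "__main__":
    # Placeholder pin locations, for demonstration only.
    sample = (
        'NET "sys_clk_n" LOC = "A1";\n'
        'NET sys_clk_p LOC = A2;\n'
        '# NET "ppc_irq_n" LOC = A3;\n'
    )
    print(missing_nets(sample))
```

The commented-out constraint is deliberately ignored, so `ppc_irq_n` is still reported as missing for the sample text.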
Re: [casper] Fwd: Re: SPDO ROACH spectrometer
Also, make sure you're running newer versions of uboot and the CPLD image. Bus settings changed some months back and improved stability significantly. Uboot will report the versions, and I recommend:

U-Boot 2008.10-svn2226 (Aug 7 2009 - 16:06:44)
...
Monitor Revision: 8.3.1698
CPLD Revision:8.1.0

At the very least, you should have CPLD Revision 8.0.1588.

The only outstanding bug that regularly affects me is that u-boot sometimes doesn't detect the PPC's SDRAM on startup. The system then hangs. Replacing the DIMM with registered memory (same as the FPGA DIMM) apparently fixes this.

Jason

On 04 Nov 2009, at 07:56, Kjetil Wormnes wrote:

Hi all,

For reference I've attached a summary of our problems below, and a few things I have attempted to do to isolate them. The short of it is that we are unable to transfer large amounts of data across the ethernet reliably, regardless of:

-- kernel version
-- whether we USB-mount or NFS-mount the root file system
-- network protocol used for the transfer

The way the crash happens varies, and is not repeatable. Sometimes it seems to be a userspace crash, sometimes it is a kernel panic. I have been unable to see any real pattern in the crash reports. This to me seems to indicate that the root cause of the problem may be common, and either an obscure kernel problem or possibly something in the interface between the kernel and the hardware or in the hardware itself.

It wouldn't be a big effort to reimplement our software to run on a remote machine and talk to the ROACH over KATCP, rather than run locally on the ppc. But since it would require a complete rewrite of the software, we haven't tested this yet. Perhaps it is worth trying. The catch is that I am still really unsure whether we are dealing with many symptoms of the same problem, or many different problems.

Anyway, I would like to thank you for all your input, and will let you know if and how we find a satisfactory solution.
Re: [casper] program ROACH over JTAG
If you already have a bitstream, simply plug a JTAG programmer into P2 (labelled Xilinx JTAG), and use iMPACT. But if I read your email correctly, you haven't configured clocks or anything, so I'm not sure what the point of this exercise is. I agree with Suraj: the easiest would be to start with a CASPER-toolflow generated base system and add/modify from there.

Jason

On 04 Nov 2009, at 10:33, C-H Cheng wrote:

Hello Suraj,

Thanks for your explanation. Maybe I didn't describe it clearly. I mean that I want to program the FPGA on ROACH over JTAG. Also, the bit file is generated by Xilinx ISE, not the CASPER toolflow. At the moment, the ucf file of my design only includes the IO pins of my design; a lot of the pins that ROACH needs are not included. How could I program the FPGA of ROACH with my bit file?

Thanks,
C-H Cheng

Hello,

On Nov 3, 2009, at 6:39 PM, C-H Cheng wrote:

Hello All

I want to simulate a design in ISE and generate a bit file to download to ROACH over JTAG.

You can do this using the .bit generated by the CASPER toolflow, available in the same location as the .bof. bof files are generated from .bit files using the script 'mkbof' distributed in 'XPS_ROACH_BASE'.

A problem I have met is the FPGA pin number assignment. For example, in ISE I select the device vxs95t, and the FPGA pin assignment in the ucf file is according to my design. But ROACH has a lot of FPGA pins which are not in my design but are needed for ROACH: sys_clk_n, sys_clk_p, aux0_clk_p, aux_clk_n, ppc_irq_n, ...etc.

Part of the magic of the toolflow is that it adds the necessary IO pins to the .ucf depending on which IO blocks you have selected to use (10gbe, adc, etc.). It's part of the reason that IO blocks get special designation as yellowblocks, as they are processed differently for each block.

Hence, I can't add the FPGA pins that ROACH needs into the ucf file of my design.

It would definitely be faster to just put the I/O blocks you want to use into a blank model file, and use the CASPER toolflow to generate the .ucf, by un-checking the boxes for 'update system design', 'system generator', and 'ISE/EDK/bitgen' in the 'bee_xps' dialog. This should take about a minute to complete.

-Suraj
[casper] 10.1 designs on BEE2
Hi all, I'm trying to retire an old 7.1 virtual machine to the digital grave that it deserves. I ported a BEE2 design to 10.1, and successfully compiled the relevant bof files for the BEE. Unfortunately, when I run the boffile, the BEE hangs. If I run the process in the background, I can see the /proc/... filesystem is populated with the right registers/brams, but these hang when I try to access these with either cat/echo or C-based commands (which work in an identical 7.1 design). Glenn Jones seems to have experienced a similar thing (http://www.mail-archive.com/casper@lists.berkeley.edu/msg00358.html), but the maillist archive hasn't led me to any solutions so far. Does anyone have an explanation/fix/info? Any help appreciated, Jack
Re: [casper] Fwd: Re: SPDO ROACH spectrometer
Hi Jason,

Thanks for your pointers; I am currently not actually using the FPGA, just focusing on being able to talk to the powerpc reliably at the moment.

The system does also crash when using NFS, but as I said and you noted, it is more difficult to trace the crashes directly back to EMAC-related kernel functions. They may very well be a secondary symptom of something else.

Now, your suggested versions for Uboot/CPLD/Monitor are interesting. We have two roach boards; the newer one that I have been testing is reporting

U-Boot 2008.10-svn2157 (Jul 31 2009 - 17:15:22)
...
Monitor Revision: 7.3.0
CPLD Revision:7.5.6

Whereas the older Roach that Wan has been using reports

U-Boot 2008.10-svn1923 (May 29 2009 - 17:22:43)
...
Monitor Revision: 6.5.1429
CPLD Revision:2.0.5

Leaving this older one aside for reference for now, I have upgraded the U-boot image on the newer roach to 20090807-uboot-nohack.bin, which is actually from revision 2212, but seemed to be the closest to the suggested revision I could find without compiling the image myself.

I looked around unsuccessfully for how to upgrade the CPLD/Monitor. Would you be able to point me in the right direction?

I'll test for any improvements with the new uboot now.

Thanks again
Kjetil

Jason Manley wrote:

Also, make sure you're running newer versions of uboot and the CPLD image. Bus settings changed some months back and improved stability significantly. Uboot will report the versions, and I recommend:

U-Boot 2008.10-svn2226 (Aug 7 2009 - 16:06:44)
...
Monitor Revision: 8.3.1698
CPLD Revision:8.1.0

At the very least, you should have CPLD Revision 8.0.1588.

The only outstanding bug that regularly affects me is that u-boot sometimes doesn't detect the PPC's SDRAM on startup. The system then hangs. Replacing the DIMM with registered memory (same as the FPGA DIMM) apparently fixes this.
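The version comparison in this exchange can be sketched in a few lines, for anyone who wants to check several boards against the suggested minimum. Treat it as an illustration only: the banner layout is taken from the text pasted in this thread, the helper names are invented here, and comparing dotted revisions as integer tuples is an assumption about how the revision numbers are ordered.

```python
import re

# Minimum CPLD revision suggested in the thread.
MIN_CPLD = (8, 0, 1588)


def parse_revision(banner, label):
    """Return the dotted revision following `label:` as a tuple of ints, or None."""
    match = re.search(re.escape(label) + r"\s*:\s*([0-9]+(?:\.[0-9]+)*)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def parse_uboot_svn(banner):
    """Return the svn revision embedded in the U-Boot version string, or None."""
    match = re.search(r"U-Boot\s+\S+-svn(\d+)", banner)
    return int(match.group(1)) if match else None


def cpld_ok(banner, minimum=MIN_CPLD):
    """True if the banner reports a CPLD revision at or above `minimum`."""
    revision = parse_revision(banner, "CPLD Revision")
    return revision is not None and revision >= minimum


if __name__ == "__main__":
    newer_board = ("U-Boot 2008.10-svn2157 (Jul 31 2009 - 17:15:22) ... "
                   "Monitor Revision: 7.3.0 CPLD Revision:7.5.6")
    print(parse_uboot_svn(newer_board), parse_revision(newer_board, "CPLD Revision"))
    print(cpld_ok(newer_board))
```

With the banner from the newer board above, the CPLD revision parses as (7, 5, 6), which falls below the suggested minimum, while the recommended 8.1.0 passes.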
Re: [casper] Fwd: Re: SPDO ROACH spectrometer
Hi Jason,

Hi all. I think it would be very cool if someone who knows could make a wiki page to tell us what the suggested set of cpld/uboot/linux codes are, and whether the suggested versions are different for different purposes. There are getting to be quite a few choices in the archive. We are firing up our ROACH development and I would like to start out with the most stable set of firmware I can.

Thanks
John

Thanks for your pointers; I am currently not actually using the FPGA, just focusing on being able to talk to the powerpc reliably at the moment.

The system does also crash when using NFS, but as I said and you noted, it is more difficult to trace the crashes directly back to EMAC-related kernel functions. They may very well be a secondary symptom of something else.

Now, your suggested versions for Uboot/CPLD/Monitor are interesting. We have two roach boards; the newer one that I have been testing is reporting

U-Boot 2008.10-svn2157 (Jul 31 2009 - 17:15:22)
...
Monitor Revision: 7.3.0
CPLD Revision:7.5.6

Whereas the older Roach that Wan has been using reports

U-Boot 2008.10-svn1923 (May 29 2009 - 17:22:43)
...
Monitor Revision: 6.5.1429
CPLD Revision:2.0.5

Leaving this older one aside for reference for now, I have upgraded the U-boot image on the newer roach to 20090807-uboot-nohack.bin, which is actually from revision 2212, but seemed to be the closest to the suggested revision I could find without compiling the image myself.

I looked around unsuccessfully for how to upgrade the CPLD/Monitor. Would you be able to point me in the right direction?

I'll test for any improvements with the new uboot now.

Thanks again
Kjetil

Jason Manley wrote:

Also, make sure you're running newer versions of uboot and the CPLD image. Bus settings changed some months back and improved stability significantly. Uboot will report the versions, and I recommend:

U-Boot 2008.10-svn2226 (Aug 7 2009 - 16:06:44)
...
Monitor Revision: 8.3.1698
CPLD Revision:8.1.0

At the very least, you should have CPLD Revision 8.0.1588.

The only outstanding bug that regularly affects me is that u-boot sometimes doesn't detect the PPC's SDRAM on startup. The system then hangs. Replacing the DIMM with registered memory (same as the FPGA DIMM) apparently fixes this.

Jason

On 04 Nov 2009, at 07:56, Kjetil Wormnes wrote:

Hi all,

For reference I've attached a summary of our problems below, and a few things I have attempted to do to isolate them. The short of it is that we are unable to transfer large amounts of data across the ethernet reliably, regardless of:

-- kernel version
-- whether we USB-mount or NFS-mount the root file system
-- network protocol used for the transfer

The way the crash happens varies, and is not repeatable. Sometimes it seems to be a userspace crash, sometimes it is a kernel panic. I have been unable to see any real pattern in the crash reports. This to me seems to indicate that the root cause of the problem may be common, and either an obscure kernel problem or possibly something in the interface between the kernel and the hardware or in the hardware itself.

It wouldn't be a big effort to reimplement our software to run on a remote machine and talk to the ROACH over KATCP, rather than run locally on the ppc. But since it would require a complete rewrite of the software, we haven't tested this yet. Perhaps it is worth trying. The catch is that I am still really unsure whether we are dealing with many symptoms of the same problem, or many different problems.

Anyway, I would like to thank you for all your input, and will let you know if and how we find a satisfactory solution.

cheers
Kjetil

Here is the summary:

*The problem*

The system crashes when downloading large files. There appear to be varying causes for this crash that may or may not have a common underlying reason.
Re: [casper] Fwd: Re: SPDO ROACH spectrometer
Hi Jason:

As Kjetil mentioned, we are only working on the OS. The OS crash and network problem appear even without our program running.

Cheers
Wan

-Original Message-
From: Jason Manley [mailto:jasonman...@gmail.com]
Sent: Wednesday, 4 November 2009 7:22 PM
To: Wormnes, Kjetil (ATNF, Marsfield)
Cc: Marc Welz; David George; Cheng, Wan (ATNF, Marsfield); casper@lists.berkeley.edu
Subject: Re: [casper] Fwd: Re: SPDO ROACH spectrometer

As you say, NFS mounts work correctly, which would indicate that the network is operating as expected.

WRT other errors, are you certain that all reads/writes on the FPGA are on 32-bit boundaries? Byte-sized and 16-bit reads are supposed to work, but we have found that for some reason they sometimes cause crashes. The system doesn't break immediately, but on subsequent bus transactions. These manifest as kernel crashes.

Jason

On 04 Nov 2009, at 07:56, Kjetil Wormnes wrote:

Hi all,

For reference I've attached a summary of our problems below, and a few things I have attempted to do to isolate them. The short of it is that we are unable to transfer large amounts of data across the ethernet reliably, regardless of:

-- kernel version
-- whether we USB-mount or NFS-mount the root file system
-- network protocol used for the transfer

The way the crash happens varies, and is not repeatable. Sometimes it seems to be a userspace crash, sometimes it is a kernel panic. I have been unable to see any real pattern in the crash reports. This to me seems to indicate that the root cause of the problem may be common, and either an obscure kernel problem or possibly something in the interface between the kernel and the hardware or in the hardware itself.

It wouldn't be a big effort to reimplement our software to run on a remote machine and talk to the ROACH over KATCP, rather than run locally on the ppc. But since it would require a complete rewrite of the software, we haven't tested this yet. Perhaps it is worth trying.
Re: [casper] Fwd: Re: SPDO ROACH spectrometer
Hi Jason: Thanks for you help. But I could not find the tut4 in workshop. Could you please provide me the exact link? Thanks. As I know, the KATCP only provide data in ASCII. Is this right? And for KATCP, all command and data are transferred by network. But we expeirence some network failure without running our own program. I guess it might not be a good idea to run our application over network at the moment. For my experience with KATCP failure, I think it belong two cases, one is OS could crash when I run bof file. Another is network connection could be broken when I access the data or registers. I can provide more details when I see them again. Anyway, thanks Jason for all information you provide. Cheers Wan -Original Message- From: Jason Manley [mailto:jasonman...@gmail.com] Sent: Wednesday, 4 November 2009 4:35 PM To: Cheng, Wan (ATNF, Marsfield) Cc: Wormnes, Kjetil (ATNF, Marsfield); m...@ska.ac.za; david.geo...@ska.ac.za; casper@lists.berkeley.edu Subject: Re: [casper] Fwd: Re: SPDO ROACH spectrometer comments appended below... On 04 Nov 2009, at 01:04, wan.ch...@csiro.au wan.ch...@csiro.au wrote: Hi Jason: Could you please let me know how do you get data using KATCP? Look at the wideband poco example (tut4) from the workshop. It is a fully-functional correlator on a single ROACH board, and includes python scripts which demonstrate the use of KATCP. Pulling data from DRAM is the same as retrieving it from BRAM or QDR. I also used KATCP for a while. But I can not find an efficient way to read a large mount data from FPGA, like 1GB data stored in the Dram. This is possible without any complicated trickery using KATCP. Expect data rates of about 10MB/s if you implement a reasonable form of hardware/software handshaking. And there are a few difficulties as well. Such as, network is not reliable, OS crashed when I download bof file sometimes. This should never happen. Please provide details. 
Jason -Original Message- From: Jason Manley [mailto:jasonman...@gmail.com] Sent: Tuesday, 3 November 2009 5:40 PM To: Wormnes, Kjetil (ATNF, Marsfield); Marc Welz; David George Cc: Cheng, Wan (ATNF, Marsfield); casper@lists.berkeley.edu Subject: Re: [casper] Fwd: Re: SPDO ROACH spectrometer Marc Welz or David George built that kernel. They are the best people to ask about this. I've cc'd them, though I'm not sure either would have the config file from that release. It might be easiest to checkout an older svn version. Might I suggest that instead of recording data to a USB HDD, that you rather record it across the network to another computer? If you don't want to use KATCP for dumping the data directly from your FPGA, you can always mount an NFS network share on your ROACH and record the data there. The USB on the PPC platforms are notoriously unreliable. Jason On 03 Nov 2009, at 03:05, Kjetil Wormnes wrote: Hi Jason, Thank you again for your reply. I can use FTP or even write my own little raw socket transfer routine, and it seems to work, I can transfer a few gigabyte-size files. However, at the end of this, the other problem kicks in; causing a system crash. I believe this is a kernel problem, as it exhibits itself differently with different kernels I have tried. So, putting the ssh problem aside as something that we can work around and returning to the other request I made; I am compiling my own kernel because I seem to need to in order to get EHCI and EXT3 to work properly. However, when I do, EMAC can't autonegotiate a link, and even forcing it to something doesn't work. The link comes up, then drops out again... repeatedly. The interesting thing is this problem *does not* occur when I compile my kernel using an svn checkout from a couple of months ago. Even with the exact same .config file. At least this is the case as far as I can tell. 
Now, in order to be 100% sure that it is in fact a difference in the source that is causing this problem, rather than just the .config, I would love it if you could send me the .config file used to compile the uImage-20091006-mmcfix kernel. The ethernet interface does appear to be more stable with that kernel, but unfortunately I can't use it as it doesn't allow USB 2.0 speeds, so if you please, the .config file would be very useful.

Thanks again for all your help
Kjetil

Jason Manley wrote:

There appears to be some issue with ssh on ROACH with large transfers. It is definitely not a hardware problem, as other network transfers work fine. Both Andrew Martens and I regularly transfer large amounts of data (1GB) using KATCP. This ssh bug has become a low priority for us as we concentrate on other things. If you do not want to try to debug it yourself, I recommend you try an FTP server.

Kjetil, you are correct; at present, KATCP does not support transfer of arbitrary files from the filesystem.

Jason

On 02 Nov 2009, at 00:51, Kjetil
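For the large-readout question raised earlier in this thread (about 1GB held in DRAM, at roughly 10MB/s over KATCP), a chunked readout plan might look like the sketch below. It is not taken from the tut4 scripts. `read_block` is a hypothetical callable standing in for whatever read call your KATCP client provides, and the 1 MiB chunk size is an assumption. Offsets and sizes are kept on 32-bit boundaries.

```python
def plan_chunks(total_bytes, chunk_bytes=1 << 20):
    """Yield (offset, size) pairs covering total_bytes in 32-bit-aligned chunks."""
    if total_bytes % 4 or chunk_bytes % 4:
        raise ValueError("sizes must be multiples of 4 bytes (32-bit words)")
    offset = 0
    while offset < total_bytes:
        size = min(chunk_bytes, total_bytes - offset)
        yield offset, size
        offset += size


def read_all(read_block, total_bytes, chunk_bytes=1 << 20):
    """Read total_bytes via read_block(offset, size), a hypothetical reader."""
    parts = []
    for offset, size in plan_chunks(total_bytes, chunk_bytes):
        data = read_block(offset, size)
        if len(data) != size:
            # A short read is treated as an error rather than silently padded.
            raise IOError("short read at offset %d" % offset)
        parts.append(data)
    return b"".join(parts)


def estimate_seconds(total_bytes, rate_bytes_per_s=10e6):
    """Rough transfer time at the ~10MB/s figure quoted in the thread."""
    return total_bytes / rate_bytes_per_s
```

At the quoted rate, `estimate_seconds(10**9)` gives about 100 seconds for 1GB, so a failure partway through is worth detecting per chunk rather than at the end.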
Re: [casper] Fwd: Re: SPDO ROACH spectrometer
Hi Jason:

I guess the latest CPLD is quite important for us. But for U-Boot, I am not sure. Will all PPC registers be re-initialized by the Linux kernel? Or does the Linux kernel use the default values initialized by U-Boot? Will U-Boot load the OS into DRAM before setting the program pointer to the OS start address? Or can the OS load itself into DRAM?

Thanks
Wan

-----Original Message-----
From: Jason Manley [mailto:jasonman...@gmail.com]
Sent: Wednesday, 4 November 2009 7:40 PM
To: Wormnes, Kjetil (ATNF, Marsfield)
Cc: Marc Welz; David George; Cheng, Wan (ATNF, Marsfield); casper@lists.berkeley.edu
Subject: Re: [casper] Fwd: Re: SPDO ROACH spectrometer

Also, make sure you're running newer versions of uboot and the CPLD image. Bus settings changed some months back and improved stability significantly. Uboot will report the versions, and I recommend:

U-Boot 2008.10-svn2226 (Aug 7 2009 - 16:06:44)
...
Monitor Revision: 8.3.1698
CPLD Revision:8.1.0

At the very least, you should have CPLD Revision 8.0.1588.

The only outstanding bug that regularly affects me is that u-boot sometimes doesn't detect the PPC's SDRAM on startup. The system then hangs. Replacing the DIMM with registered memory (same as the FPGA DIMM) apparently fixes this.

Jason

On 04 Nov 2009, at 07:56, Kjetil Wormnes wrote:

Hi all,

For reference I've attached a summary of our problems below, and a few things I have attempted to do to isolate them. The short of it is that we are unable to transfer large amounts of data across the ethernet reliably, regardless of:
--kernel version
--whether we use a usb-mounted or nfs-mounted root file system
--network protocol used for transfer

The way the crash happens varies, and is not repeatable. Sometimes it seems to be a userspace crash, sometimes it is a kernel panic. I have been unable to see any real pattern in the crash reports.
This to me seems to indicate that the root cause of the problem may be common, and either an obscure kernel problem or possibly something in the interface between the kernel and the hardware or in the hardware itself.

It wouldn't be a big effort to reimplement our software to run on a remote machine and talk to the ROACH over KATCP, rather than run locally on the ppc. But since it would require a complete rewrite of the software, we haven't tested this yet. Perhaps it is worth trying. The catch is that I am still really unsure whether we are dealing with many symptoms of the same problem, or many different problems.

Anyway, I would like to thank you for all your input, and will let you know if and how we find a satisfactory solution.

cheers
Kjetil

Here is the summary:

*The problem*
The system crashes when downloading large files. There appear to be varying causes for this crash that may or may not have a common underlying reason. I have attempted to isolate the problem by
* Downloading using different protocols and software; ssh and two different ftp servers.
* Mounting the filesystem over NFS as opposed to USB
* Installing well-known and used kernels, and comparing to custom kernels.

*SSH*
SSH always crashes with "Invalid MAC on input" or related error messages. This appears to be a problem with SSH.

*FTP*
System instabilities were observed using two different ftp servers: proftpd and pure-ftpd. In the best case, with pure-ftpd, I was able to download 2-3 files, each about 2GB in size, before the system crashed. Looking through the call stack seemed to indicate that the crash happened in EMAC interface functions (i.e. ethernet). However, we have no way of knowing whether these crashes are in fact side-effects of the USB subsystem misbehaving. Jason from the Casper mailing list has once again reconfirmed that USB on powerpcs is notoriously unreliable.
*DIFFERENT KERNELS - DIFFERENT PROBLEMS*
With some kernels (the latest), the link was unable to come up at all, while with both a custom-compiled older kernel (from a couple of months ago) and a downloaded image, uImage-20091006-mmcfix, the link came up, but with all the crashes described.

*ELIMINATING USB AS A CAUSE*
To eliminate the effects of USB, I mounted the root filesystem remotely using NFS. I made a few observations:

*SSH*
Still dies from time to time with the "Invalid MAC" error message. This was expected, as we have already pretty much determined that this error is ssh-specific and not related to our other worries.

*ETHERNET*
Comes up nicely. The system mounts remotely and file access has not caused any obvious problems. In fact I have not really had any problems that I can trace directly back to the Ethernet. That being said, the system seems to crash after a little while with this setup also. The error messages have been varying. Only once has it been a kernel crash, and then, looking at the call stack, it no longer appears to crash inside EMAC access functions. The download speeds seem quite variable; but this is probably more likely due to the
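The version check Jason recommends in this message (CPLD Revision 8.0.1588 at the very least) can be scripted against a captured boot banner. The sketch below assumes the banner text has the form quoted in the thread, and that revisions can be compared as three numeric fields; both are assumptions, and the function names are hypothetical.

```python
import re

# Minimum CPLD revision quoted in the thread.
MIN_CPLD = (8, 0, 1588)


def parse_cpld_revision(banner):
    """Return the CPLD revision as a tuple of ints, or None if not found."""
    match = re.search(r"CPLD Revision:\s*(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def cpld_is_recent_enough(banner, minimum=MIN_CPLD):
    """True if the banner reports a CPLD revision at or above the minimum."""
    revision = parse_cpld_revision(banner)
    return revision is not None and revision >= minimum
```

Comparing tuples field by field means 8.1.0 ranks above 8.0.1588, which matches the order in which the two revisions are recommended above.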
[casper] xlUpdateModel
Hi all. I'm trying to port a rather complex model to 10.1, and I had hoped that xlUpdateModel would allow me to do it rather easily, but when I run it Matlab crashes. It seems to choke on the gavrt library's vacc module. Has anyone gotten this to work, or should I forget it and just redraw the model? John
Re: [casper] xlUpdateModel
John,

I think the shortcomings of xlUpdateModel are what made the transition from 7.1 to 10.1 so painful. Dynamically drawn blocks like the vacc will not be handled correctly in general. Therefore, I think it will be much easier and more reliable to simply redraw the diagram block for block in 10.1.

Glenn

On Wed, Nov 4, 2009 at 4:12 PM, John Ford jf...@nrao.edu wrote:

Hi all. I'm trying to port a rather complex model to 10.1, and I had hoped that xlUpdateModel would allow me to do it rather easily, but when I run it Matlab crashes. It seems to choke on the gavrt library's vacc module. Has anyone gotten this to work, or should I forget it and just redraw the model?

John