Re: MPC5200B XLB Configuration Issues, FEC RFIFO Events, ATA Crashes
Hi Roman:

Sorry for the long delay; I had to fix some other stuff first before I could launch the test. Here is just a short intermediate result.

On 04.02.10 20:35, Albrecht Dreß wrote:
> Actually, I forgot that I have to explicitly enable libata dma on the 5200b, due to the known silicon bugs... I will repeat my tests with the proper configuration, stay tuned.
>
>>> ... a signal processor attached to the localbus, using bestcomm and the fifo for the bulk transfer
>>
>> Are you using an own driver, or are you using Grant's SCLPC+SDMA driver? BD task?
>
> Basically Grant's driver, but with a slightly modified variant of the gen_bd task. The signal processor is a LE, and I managed to insert the LE/BE conversion into the bestcomm task (see also http://patchwork.ozlabs.org/patch/35038/). Unfortunately, there is no good documentation of the engine; I would like to also shift crc calculation into bestcomm, which seems to be possible in principle, but I never got it running.
>
>> The best thing is to run very ugly tests with very high load for at least 24h.

Today I launched my test application on kernel 2.6.32 with a few minor tweaks. It runs 4 threads in parallel; each first writes a number of data blocks, does a sync() when appropriate, then reads them all back and checks the contents (md5 hash):
- one writes/reads back 256 files of 256k each to a nfs3 share on a Xeon server, using a 100 MBit line;
- one writes/reads back one 1 MByte block using BestComm to a Localbus device (see quote above);
- two write/read back 128 files of 64k each to two CF cards w/ vfat, both attached to the ata (master/slave).

Booting with 'libata.force=mwdma2', this test reproducibly freezes the system *within a few minutes*, in one case leaving the vfat fs on one card completely broken. The system didn't throw a panic; it was always simply stuck - no response to the serial console, nothing.

Booting *without* this option (i.e. using pio for the cf cards), the system seems to run flawlessly. I will continue the test over the weekend (now active for ~5 hours), but it looks as if I can reproduce your problem.

Next week, I'll try your fix (hope I don't wear out the cf cards...), and re-run the test.

Best, Albrecht.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
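The write / sync / read-back / md5-check procedure described in the message above could be sketched roughly as follows. This is only an illustration, not Albrecht's actual test application: the function names, the file naming and the directory arguments are hypothetical, and the BestComm/Localbus worker is not modelled at all. It uses one thread per target directory, as in the description.

```python
import hashlib
import os
import tempfile
import threading


def write_read_verify(directory, n_files, file_size, errors):
    """Write n_files random files, fsync each, read back, compare md5 hashes."""
    expected = {}
    for i in range(n_files):
        data = os.urandom(file_size)
        path = os.path.join(directory, f"block_{i:04d}.bin")
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the kernel to push the data to the device
        expected[path] = hashlib.md5(data).hexdigest()
    for path, digest in expected.items():
        with open(path, "rb") as f:
            if hashlib.md5(f.read()).hexdigest() != digest:
                errors.append(path)


def run_stress(targets, rounds=1):
    """targets: list of (directory, n_files, file_size); one thread per entry."""
    errors = []
    for _ in range(rounds):
        threads = [
            threading.Thread(target=write_read_verify, args=(d, n, size, errors))
            for d, n, size in targets
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    return errors


if __name__ == "__main__":
    # Placeholder directories; on a real board these would be the NFS mount
    # and the two CF card mount points (hypothetical paths, not from the thread).
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        print("mismatches:", run_stress([(a, 8, 64 * 1024), (b, 8, 64 * 1024)]))
```

Note that a read-back directly after the write is likely to be served from the page cache rather than from the device, so a real test would have to make sure the data is actually re-read from the medium; how the original test handled this is not stated in the thread.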
Re: MPC5200B XLB Configuration Issues, FEC RFIFO Events, ATA Crashes
Hi Roman:

On 03.02.10 07:16, Roman Fietze wrote:
> Sorry for the delay ... your mail got stuck in a Notes spam filter.

Never mind. I didn't know yet that I'm *such* a nasty guy... ;-)

> Are you using MWDMA2 with the compact flash cards? What is the load on the different (DMA) channels? ATA reads or writes?

Actually, I forgot that I have to explicitly enable libata dma on the 5200b, due to the known silicon bugs... I will repeat my tests with the proper configuration, stay tuned.

>> ... a signal processor attached to the localbus, using bestcomm and the fifo for the bulk transfer
>
> Are you using an own driver, or are you using Grant's SCLPC+SDMA driver? BD task?

Basically Grant's driver, but with a slightly modified variant of the gen_bd task. The signal processor is a little-endian (LE) device, and I managed to insert the LE/BE conversion into the bestcomm task (see also http://patchwork.ozlabs.org/patch/35038/). Unfortunately, there is no good documentation of the engine; I would also like to shift the crc calculation into bestcomm, which seems to be possible in principle, but I never got it running.

> The best thing is to run very ugly tests with very high load for at least 24h.

Thanks again for this tip! I hope I manage to run a test over the weekend. Throughput onto the cf cards is not critical for me (so I could live with pio there), but I'm a little afraid I might also see similar effects with fec and the signal processor (in particular, the latter *is* critical).

Thanks, Albrecht.
Re: MPC5200B XLB Configuration Issues, FEC RFIFO Events, ATA Crashes
Hello Albrecht,

Sorry for the delay ... your mail got stuck in a Notes spam filter.

On Friday 22 January 2010 21:11:39 Albrecht Dreß wrote:
> Are there any final conclusions from your tests?

Yes. For the small product using the 2.6 kernel we turned off all snooping and the kernel coherent-cache flag. Since we introduced that change we have seen no more crashes of the FEC and/or the hard disk.

We are currently investigating problems that we have been seeing for a long time on the 2.4.25 (DENX and Lite5200B based) boards. Here we are having problems with corrupt filesystems and FEC hiccups. In this case we are using UDMA2, because we cannot yet get MWDMA2 to work on 2.4.25, knowing well that there might be a problem with UDMA2 and LPC. So we also turned off snooping there and are currently in the testing phase (again).

> ... two compactflash cards with vfat file systems attached to the ata bus;
> - a nfs3 network drive, connected via a 100MBit line, on a Xeon server

Are you using MWDMA2 with the compact flash cards? What is the load on the different (DMA) channels? ATA reads or writes?

> ... a signal processor attached to the localbus, using bestcomm and the fifo for the bulk transfer

Are you using an own driver, or are you using Grant's SCLPC+SDMA driver? BD task?

Our latest product uses an SMSC MOST150 Spynic and an FPGA to sample data from a MOST ring via SCLPC+SDMA (a single non-BD task from the old Freescale BestComm API) on the 2.4.25. Here, moving from memory accesses to SCLPC+SDMA helped somewhat, probably because the SDMA scheduler now does most of the scheduling of the LPC traffic, which somehow avoids the UDMA2/LPC arbiter problem.

The probability of seeing problems or crashes increases a lot with the bandwidth, I think (and I might be wrong) especially when an arbiter or scheduler (LPC/PCI or SDMA) needs to switch users or tasks. In our case we have data running at about 3-6 MB/s (avg.) via the LPC to the hard disks, or somewhat more using FTP from the hard disk to the FEC.

> I did not observe any issues

The filesystem crashes are rare, but they happen often enough that we can reproduce them once every 1 or 2 days under heavy load and, what is even worse, they produce failures in the field. They also became statistically much more frequent when we ran out of GPIOs on the MPC5200B and used a CPLD or FPGA to replace them; it was just a few bits to mux SPI lines, but that was enough.

> but your statements are making me really nervous...

That was not my intention. The best thing is to run very ugly tests with very high load for at least 24h.

Because we see those problems on different boards, we (the SW guys) can no longer assume self-made HW problems (HW guys), especially after reading Freescale's advice on the XLB config.

It might happen that we switch to 2.6 on our older products, hoping that at least the LPC/IDE problem disappears by using MWDMA2 instead of UDMA2.

Roman

-- 
Roman Fietze                Telemotive AG Büro Mühlhausen
Breitwiesen                 73347 Mühlhausen
Tel.: +49(0)7335/18493-45   http://www.telemotive.de
Re: MPC5200B XLB Configuration Issues, FEC RFIFO Events, ATA Crashes
Hi Roman:

On 16.12.09 12:37, Roman Fietze wrote:
> The board is using the MPC5200B on a board derived from the old lite5200, but with the fixes for the MPC5200B. All tests are run using an ST940813AM hard drive with an ext2 and an ext3 of 10GB each, default mkfs options. The OS is Debian 4.0. The network connection is between a fast Athlon XP2 6400 and the target, using 100MBit/s wiring and a 100MBit/s switch.

Are there any final conclusions from your tests?

I am using a board derived from the Lite5200b, running u-boot 2009.03 and a recent stock kernel, where I performed tests concurrently reading and writing data to and from
- two compactflash cards with vfat file systems attached to the ata bus;
- a nfs3 network drive, connected via a 100MBit line, on a Xeon server;
- a signal processor attached to the localbus, using bestcomm and the fifo for the bulk transfer.

I did not observe any issues, but your statements are making me really nervous...

Thanks, Albrecht.
Re: MPC5200B XLB Configuration Issues, FEC RFIFO Events, ATA Crashes
Roman Fietze wrote:
> Hello Wolfram,
>
> On Wednesday 09 December 2009 15:57:48 Wolfram Sang wrote:
>> Do you have a way to measure performance penalties?
>
> As I said, I do. And here they are. They won't win a prize for the most impartial benchmarks ever seen, but they'll be a good starting point to get a feeling what stability might cost.
>
> The board is using the MPC5200B on a board derived from the old lite5200, but with the fixes for the MPC5200B. All tests are run using an ST940813AM hard drive with an ext2 and an ext3 of 10GB each, default mkfs options. The OS is Debian 4.0. The network connection is between a fast Athlon XP2 6400 and the target, using 100MBit/s wiring and a 100MBit/s switch.
>
> The F always stands for fast settings, coherent cache, XLB features like snooping, etc. turned on, XLB config 0xa006 or 0x0001a006 (makes no or no big difference).
>
> The S always stands for slow settings, non coherent cache, XLB features like snooping, etc. turned off, XLB config 0x80012006.

Which disk access mode (pio, mwdma or udma) did you use for these tests?

Wolfgang.
Re: MPC5200B XLB Configuration Issues, FEC RFIFO Events, ATA Crashes
Hello Wolfgang,

On Friday 18 December 2009 09:24:29 Wolfgang Grandegger wrote:
> What disc access modes, (pio, mwdma or udma) did you use for these tests.

MWDMA2.

On our hardware, UDMA2 under Linux 2.6 only works with very few disks. On top of that, UDMA seems to have problems on the MPC5200B with concurrent traffic to or from the LPC (chip selects of peripherals while a DMA transfer is active).

The same disks work without problems using UDMA2 on the old 2.4.25 on exactly the same HW. Why? I don't know yet.

Roman
Re: MPC5200B XLB Configuration Issues, FEC RFIFO Events, ATA Crashes
Hello Wolfram,

On Wednesday 09 December 2009 15:57:48 Wolfram Sang wrote:
> Do you have a way to measure performance penalties?

As I said, I do. And here they are. They won't win a prize for the most impartial benchmarks ever seen, but they'll be a good starting point to get a feeling for what stability might cost.

The board is using the MPC5200B on a board derived from the old lite5200, but with the fixes for the MPC5200B. All tests are run using an ST940813AM hard drive with an ext2 and an ext3 of 10GB each, default mkfs options. The OS is Debian 4.0. The network connection is between a fast Athlon XP2 6400 and the target, using 100MBit/s wiring and a 100MBit/s switch.

The F always stands for fast settings, coherent cache, XLB features like snooping, etc. turned on, XLB config 0xa006 or 0x0001a006 (makes no or no big difference).

The S always stands for slow settings, non coherent cache, XLB features like snooping, etc. turned off, XLB config 0x80012006.

Bonnie++ V1.03:

               -------Sequential Output-------  --Sequential Input--  --Random-
               -Per Chr-  --Block--  -Rewrite-  -Per Chr-  --Block--  --Seeks--
Machine  Size  K/sec %CP  K/sec %CP  K/sec %CP  K/sec %CP  K/sec %CP   /sec %CP
F ext2:  256M   2676  99  14759  47   5502  24   2792  98  16201  26  136.2   4
S ext2:  256M   2647  99  14462  56   5468  29   2778  99  15856  32  134.2   4
F ext3:  256M   2461  96  13327  73   5792  29   2798  98  16213  27  133.9   3
S ext3:  256M   2455  97  13340  85   5533  33   2759  98  15855  32  136.7   4

               -------Sequential Create-------  ---------Random Create---------
               -Create--  --Read---  -Delete--  -Create--  --Read---  -Delete--
Machine files   /sec %CP   /sec %CP   /sec %CP   /sec %CP   /sec %CP   /sec %CP
F ext2:    64     85  99  84712  99  14517 100     86  99 114078 100    309  99
S ext2:    64     87  99  85872 100  15919  99     88  99 114094  99    317  99
F ext3:    64   2826  92  59290  99   3803  71   2778  90 114706 100   3872  74
S ext3:    64   2774  92  59474 100   3810  74   2726  91 113937 100   3905  77

Netcat tests to or from my MPC5200B target. Each entry gives the 'time -p' output of the netcat run (real/user/sys) followed by the network data rate in MB/s. The source data is either a dd from /dev/zero with bs=10M count=1024 or a file of equal size on an ext2 or ext3 filesystem. The destination is either /dev/null or a file on an ext2 or ext3.

Server:/dev/zero - Target:/dev/null
    F: 1594/15.3/1031        6.8
    S: 1864/8.8/921          5.8

Target:/dev/zero - Server:/dev/null
    F: 1396/35.8/1361        7.7
    S: 1578/51.5/1526        6.8

Server:/dev/zero - Target:ext2
    F: 1799/37.0/1721        6.0
    S: 2093/42.6/2009        5.1

Target:ext2 - Server:/dev/null
    F: 2423/31.3/1030        4.4
    S: 2820/15.2/1186        3.8

Server:/dev/zero - Target:ext3
    F: 2110/33.4/1912.88     5.1
    S: 2397/47.5/2208.63     4.5

Target:ext3 - Server:/dev/null
    F: 2407/17.9/1016        4.5
    S: 2676/15.4/1160        4.0

I repeated one or the other test and got comparable results.

Please also keep in mind that, once I added a more or less high load on the SCLPC/LPC, the F system crashed either right away or at least once every few hours, depending on what we did (TX/RX, FLASH/FPGA, SCLPC+DMA/SCLPC/CPU, ...) and on the concurrent load on the FEC and ATA. If you only need one or the other peripheral in your system, you might get away with all the XLB features turned on.

In addition, with the fast settings I got wrong data in the first few bytes when reading from FLASH or the FPGA using SCLPC+DMA, unless I manually flushed or invalidated the cache prior to the DMA submission, as dma_map_single would do it in non cache
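To put the netcat figures from this message into perspective, the relative cost of the slow (S) settings can be computed from the posted wall-clock times. The snippet below is only a small arithmetic sketch; the dictionary and function names are made up for illustration, and the numbers are the "real" values copied from the results above.

```python
# "real" seconds from the netcat runs posted above, as (F, S) pairs.
REAL_SECONDS = {
    "Server:/dev/zero - Target:/dev/null": (1594, 1864),
    "Target:/dev/zero - Server:/dev/null": (1396, 1578),
    "Server:/dev/zero - Target:ext2": (1799, 2093),
    "Target:ext2 - Server:/dev/null": (2423, 2820),
    "Server:/dev/zero - Target:ext3": (2110, 2397),
    "Target:ext3 - Server:/dev/null": (2407, 2676),
}


def slowdown_percent(fast_seconds, slow_seconds):
    """How much longer (in percent) the S run took compared to the F run."""
    return (slow_seconds / fast_seconds - 1.0) * 100.0


for name, (fast, slow) in REAL_SECONDS.items():
    print(f"{name:38s} S takes {slowdown_percent(fast, slow):4.1f}% longer than F")
```

For these six runs the slow settings take roughly 11% to 17% longer than the fast ones.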
MPC5200B XLB Configuration Issues, FEC RFIFO Events, ATA Crashes
Hello,

I had a discussion with some Freescale support people about the constant RFIFO Event errors or PATA hard disk crashes when stressing a board with (very) high loads, in my case even with some additional traffic from an FPGA (or just the FLASH) via the SCLPC FIFO using the BestComm.

The latest recommendation from Freescale was to set BSDIS and PLDIS and to clear SE, resulting in an XLB config register value of 0x80012006, at least for my MPC5200B with PVR/SVR 0x80822014/0x80110022. If I am correct, the XLB config setting with the current U-Boot and kernel is 0xa006.

Because snooping is disabled with this setting, I had to set CONFIG_NOT_COHERENT_CACHE=y for the MPC5200B boards so that the kernel maintains cache coherency in software. Please correct me if I am wrong with this assumption.

With this setup (XLB config and kernel config) I no longer have any PATA crashes, I no longer have any FEC RFIFO errors, and my SCLPC driver runs w/o problems, too.

I would be interested in your opinion; maybe Wolfgang could make some comments, because he is involved in the U-Boot a lot as well.

Here is the original text from the Freescale support person:

-- snippety snip --
Dear Roman Fietze,

In reply to your message regarding Service Request SR 1-597437219:

Disable pipelining also. Test your software using the following setting of the XLB arbiter: 0x80012006. We have request from a customer with similar problem. Problem disappeared if XLB pipelining and BestComm snooping were disabled.

Should you need to contact us with regard to this message, please see the notes below.

Best Regards,
Pavel
Technical Support
Freescale Semiconductor
-- snappety snap --

Roman
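The step from 0xa006 to 0x80012006 described in this message can be checked with a few lines of bit arithmetic. This is a sketch under assumptions: the three bit positions below are inferred from the difference between the two quoted register values, and the assignment of the names PLDIS, BSDIS and SE to those positions should be verified against the MPC5200B User's Manual before relying on it. The helper names are hypothetical.

```python
# Assumed bit positions (inferred from the two values quoted in the message).
XLB_CFG_PLDIS = 1 << 31   # pipeline disable
XLB_CFG_BSDIS = 1 << 16   # BestComm snooping disable
XLB_CFG_SE = 1 << 15      # snoop enable

DEFAULT_CFG = 0x0000A006       # value the message attributes to current U-Boot/kernel
RECOMMENDED_CFG = 0x80012006   # value recommended by Freescale support


def apply_recommendation(cfg):
    """Set PLDIS and BSDIS, clear SE, leave all other bits untouched."""
    return (cfg | XLB_CFG_PLDIS | XLB_CFG_BSDIS) & ~XLB_CFG_SE & 0xFFFFFFFF


def describe(cfg):
    """Report the state of the three bits discussed in the thread."""
    return {
        "PLDIS": bool(cfg & XLB_CFG_PLDIS),
        "BSDIS": bool(cfg & XLB_CFG_BSDIS),
        "SE": bool(cfg & XLB_CFG_SE),
    }


print(hex(apply_recommendation(DEFAULT_CFG)))  # 0x80012006
print(describe(RECOMMENDED_CFG))
```

The remaining bits (0x2006) are identical in both values and are simply carried over.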
Re: MPC5200B XLB Configuration Issues, FEC RFIFO Events, ATA Crashes
Hello Roman,

> I would be interested in your opinion, maybe Wolfgang could make some comments, because he is involved in the U-Boot a lot as well.

Do you have a way to measure performance penalties? I know that stability comes before performance; still, I am wondering, as it looks to me that the most interesting features are simply switched off.

Regards,

Wolfram

-- 
Pengutronix e.K.                | Wolfram Sang               |
Industrial Linux Solutions      | http://www.pengutronix.de/ |
Re: MPC5200B XLB Configuration Issues, FEC RFIFO Events, ATA Crashes
Hello Wolfram,

On Wednesday 09 December 2009 15:57:48 Wolfram Sang wrote:
> Do you have a way to measure performance penalties?

Yes, I do. I am using an SCLPC test driver derived from a driver written or posted by Grant Likely named mpc5200-localplus-test. This gives me some useful output about the SCLPC BestComm FIFO read throughput. In addition, I'm running a relatively dumb ATA stress test that writes large files to an ext3 and gives me the data rate, as well as an NFS mount that I read data from.

I will now run all three tests in a few combinations (those that do not crash my system with the old setup), adding NFS writes and ATA reads. I will post the numbers here as soon as I'm done and am confident that they reflect reality at least somewhat; please give me 2 or 3 days.

> I know that stability comes before performance, still I am wondering as it looks to me that the most interesting features are simply switched off.

We are seeing the same problem in our device, but having a return rate of almost 100% due to corrupt file systems that cannot be repaired in the field is no alternative.

But, something that never happened before, the system has now been running SCLPC read, ATA write and NFS read for more than 20 hours without any crash. That's an argument.

Probably the XLB setup has to be done in U-Boot anyway, and there the configuration can be flexible enough to enable those positive features on boards that use only components that do not conflict. The only thing left is the cache coherency switch in the kernel config.

Roman