I'm seeing some slightly strange behavior with PPC440GX ethernet, and I'm looking for a little advice on where I can poke around to see what's going on.

I have a custom 440GX board. I use the two RGMII gigabit interfaces, connected to two Vitesse PHYs. This works nicely. The board has a large FPGA signal processor that DMAs data into main memory, and the PPC sends data from main memory out the ethernet interfaces. This all works well.

For testing purposes I'm DMA'ing a pseudo-random sequence at 80MB/s, sending it over ethernet on a TCP socket to a server machine, and checking the sequence at the receiving end. So far so good. It runs for days on several prototype machines.

As part of the DMA diagnostic program I keep track of the maximum occupied capacity of the main memory ring buffer that holds data from my FPGA device driver. This lets me keep track of how close I get to a buffer overflow, seeing as I'm running the gigabit ethernet port close to the edge at 80MB/s. The ring buffer will typically reach a maximum level of 512kB. This is how far the network connection gets behind the realtime DMA from the FPGAs.

Here's the weird part. On one of the four prototype boxes, if I plug the second ethernet port into a gigabit switch and get a link light (the 2nd interface is not enabled under Linux), the DMA behavior changes and I can see the ring buffer get as large as 25MB (up from 512kB!). Only one of my four boxes shows this strange behavior, and only when the second ethernet port is connected to an ethernet switch.

Everything still works properly: my 80MB/s pseudo-random sequence is still generated by the FPGAs and checked by a server on the other end of the network connection. I let the ring buffer get as large as 64MB before failing, but the large ring buffer says that the network connection sometimes gets as much as 25MB behind the FPGA DMA, or 25/80 = 0.3125 seconds, which seems kind of crazy. I look at "ifconfig" (busybox ifconfig) and I see no errors on the ethernet interface.
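In case it helps to see what I mean by the high-water mark, the bookkeeping amounts to roughly the following. This is only a minimal sketch, not the actual driver code: the struct and function names (`ring_stats`, `ring_update_high_water`, `ring_lag_ms`) are made up for illustration, and it assumes a byte-offset ring where head == tail means empty.

```c
#include <stddef.h>

/* Hypothetical ring-buffer bookkeeping; all names are illustrative only. */
struct ring_stats {
    size_t capacity;    /* total ring size in bytes */
    size_t head;        /* producer offset: next byte the FPGA DMA writes */
    size_t tail;        /* consumer offset: next byte handed to the socket */
    size_t high_water;  /* largest occupancy seen so far, in bytes */
};

/* Bytes queued between consumer and producer, allowing for wrap-around.
 * head == tail is treated as an empty ring. */
static size_t ring_occupancy(const struct ring_stats *r)
{
    return (r->head >= r->tail) ? r->head - r->tail
                                : r->capacity - r->tail + r->head;
}

/* Call after each producer update; records the maximum occupancy seen
 * and returns the current occupancy. */
static size_t ring_update_high_water(struct ring_stats *r)
{
    size_t used = ring_occupancy(r);

    if (used > r->high_water)
        r->high_water = used;
    return used;
}

/* Convert a backlog in bytes into whole milliseconds of lag at a given
 * byte rate (truncating). 64-bit intermediate avoids overflow on 32-bit. */
static unsigned long ring_lag_ms(size_t backlog_bytes,
                                 unsigned long bytes_per_sec)
{
    return (unsigned long)(((unsigned long long)backlog_bytes * 1000ULL)
                           / bytes_per_sec);
}
```

Taking MB as 10^6 bytes, `ring_lag_ms(25000000, 80000000)` comes out at 312 ms, which is the 25/80 figure above.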
I'm guessing there might be some design problem, or maybe just a problem with this one particular board, that is causing errors that occasionally slow down the TCP connection. Perhaps it's crosstalk between the two RGMII interfaces, or maybe some interaction between the magnetics on the two ports, but I can't figure out where to look to measure errors on the physical ethernet interface. Can someone give me a hint about where to look for this problem? This is a 2.6.15 kernel.

Thanks for reading, I went on a bit long...

jeff
_______________________________________________
Linuxppc-embedded mailing list
https://ozlabs.org/mailman/listinfo/linuxppc-embedded
