The NuttX simulator running tcpecho doesn't appear to have any problem with large TCP sends, or with a large number of large TCP sends. So the problem with tcpecho / tcpblaster on the SAMA5D36 seems to be somewhere in the SAMA5 code. The GMAC driver fix that I have only works when net logging is turned on, and I tested it more thoroughly today. It definitely works. But I don't understand how it works, or why it doesn't work at high speeds.
If the TX packets are going into DMA memory but are never sent, then there can only be a few things wrong: the MAC configuration, the DMA configuration, or the PHY setup. I would tend to think that the PHY setup is the most suspicious. That driver (in its various forms on different Atmel parts) has been very well exercised over many years.
That doesn't mean it is error free, just not the first place I would look.
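
For anyone who wants to reproduce the "large number of large TCP sends" case from a host PC, here is a minimal sketch of a client that pushes payloads at an echo server and checks what comes back. It is not the tcpecho or tcpblaster code; the function name `blast_echo`, its parameters, and the default sizes are all made up for illustration, and it assumes the target simply echoes every byte it receives.

```python
import socket


def blast_echo(host, port, payload_size=16 * 1024, count=100, timeout=5.0):
    """Send `count` payloads of `payload_size` bytes and verify each echo.

    Returns the number of payloads that came back intact. A stalled
    connection shows up as a socket timeout rather than a silent hang.

    Hypothetical helper: the name, defaults, and behavior are assumptions
    for illustration, not part of NuttX.
    """
    # Non-repeating-looking pattern so dropped or reordered data is visible.
    payload = bytes(i % 251 for i in range(payload_size))
    intact = 0

    with socket.create_connection((host, port), timeout=timeout) as sock:
        for _ in range(count):
            # Note: the whole payload is sent before reading the echo, so
            # payloads much larger than the socket buffers would need a
            # concurrent reader to avoid both sides blocking on send.
            sock.sendall(payload)

            received = bytearray()
            while len(received) < payload_size:
                chunk = sock.recv(payload_size - len(received))
                if not chunk:
                    raise ConnectionError("peer closed the connection early")
                received.extend(chunk)

            if received == payload:
                intact += 1

    return intact
```

Calling something like `blast_echo(board_ip, echo_port)` against the simulator and then against the SAMA5D36, with net logging on and off, should show whether the stall follows the board. `board_ip` and `echo_port` are placeholders for whatever address and port the echo server is actually configured with.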