Mikhail T. wrote: > I was copying /home from an old server (narawntapu) to a new one > (aldan). The narawntapu:/home is mounted on aldan as /mnt with flags > ro,intr. On narawntapu /home was simply located on an SSD, but on aldan > I created a ZFS filesystem for it. > > The copying was started thus: > > root@aldan:/home (435) cp -Rpn /mnt/* . > > for a while this was proceeding at a decent clip with cp making > newnfsreq-uests: > > load: 0.78 cmd: cp 38711 [newnfsreq] 802.84r 1.57u 140.63s 20% 10768k > > /mnt/mi/.kde/share/apps/kmail/dimap/.42838394.directory/sent/cur/1219621413.32392.hd8cl:2,S > -> > > ./mi/.kde/share/apps/kmail/dimap/.42838394.directory/sent/cur/1219621413.32392.hd8cl:2,S > 100% > load: 1.23 cmd: cp 38711 [newnfsreq] 874.19r 1.66u 154.74s 17% 4576k > > /mnt/mi/.kde/share/apps/kmail/dimap/.42838394.directory/ML/cur/1219595347.32392.rMDFf:2,S > -> > > ./mi/.kde/share/apps/kmail/dimap/.42838394.directory/ML/cur/1219595347.32392.rMDFf:2,S > 100% > > ZFS on the destination compressing and writing stuff out and the traffic > between the two ranging from 30 to 50Mb/s (according to systat), but > then something happened and the cp-process is now hung: > > load: 0.55 cmd: cp 38711 [fifoor] 1107.67r 2.09u 194.12s 0% 3300k > load: 0.50 cmd: cp 38711 [fifoor] 1112.66r 2.09u 194.12s 0% 3300k > load: 0.22 cmd: cp 38711 [fifoor] 1642.37r 2.09u 194.12s 0% 3300k > Doing `ps axHl` will show you what the ``cp`` process is stuck on (WCHAN). If it is down inside ZFS, then I suspect it is ZFS resource related. If it is stuck somewhere in NFS or the kernel RPC, then I`d suspect a net driver issue: - The number 1 issue for net drivers vs NFS is TSO, so disabling TSO is the first thing to try (if the processes aren`t stuck inside zfs). (In the machine that is sending data, since it is a transmit segment limit problem. If I understood what you were doing, that would be the NFS server, but I`d disable it on both server and client.) - If that doesn`t fix it, try rsize=32768,wsize=32768 mount options for the NFS mount.
These TSO issues are slowly getting resolved, but some drivers may still be broken, especially if you aren`t running head. (For example, only a very recent em(4) driver is fixed.) You can also do things like `netstat -m` to look for mbuf cluster exhaustion and look at the stats for your net driver (usually a sysctl). Good luck with it, rick > There is nothing in the logs on the new system, but the old one has a > number of entries like: > > Nov 28 10:28:45 narawntapu kernel: sonewconn: pcb > 0xfffff80086231930: Listen queue overflow: 8 already in queue > awaiting acceptance (62 occurrences) > Nov 28 10:29:45 narawntapu kernel: sonewconn: pcb > 0xfffff80086231930: Listen queue overflow: 8 already in queue > awaiting acceptance (50 occurrences) > Nov 28 10:30:46 narawntapu kernel: sonewconn: pcb > 0xfffff80086231930: Listen queue overflow: 8 already in queue > awaiting acceptance (59 occurrences) > Nov 28 10:31:46 narawntapu kernel: sonewconn: pcb > 0xfffff80086231930: Listen queue overflow: 8 already in queue > awaiting acceptance (57 occurrences) > Nov 28 10:32:46 narawntapu kernel: sonewconn: pcb > 0xfffff80086231930: Listen queue overflow: 8 already in queue > awaiting acceptance (68 occurrences) > > Both systems are largely idle now. I'm not in a hurry -- is anybody > interested in investigating it in situ? What is "fifoor" -- does this > point to a trouble in the ZFS, the NFS-client, or the NFS-server? Both > systems run FreeBSD/amd64 of recent 10.x-vintage. > > Thanks! > > -mi > > _______________________________________________ > freebsd-stable@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org" > _______________________________________________ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"