I'm having an extremely puzzling problem with my attempted PVFS setup right now. Here's the intended setup: three computers run the PVFS cluster, named pot, crystal, and meth. Pot holds the metadata, while crystal and meth hold data and are each connected through two gigabit Ethernet ports. In effect, the network sees five computers: pot, meth0, meth1, crystal0, crystal1. I'm running two pvfs2-server instances on each of meth and crystal, one for each gigabit port.
In my first attempt, I just assigned meth0 and meth1 two different TCP ports. PVFS came up, and I could write files to it successfully. However, when I transferred a large file, it was clear from /proc/net/dev that only a single Ethernet port was being used on each of crystal and meth.
For my second attempt, I tried enabling TCPBindSpecific and using the same TCP port for every instance. As I understand it, this forces each pvfs2-server to accept data only over the network port (interface) to which it is assigned.
Now, the really strange problem: meth1, crystal0, and crystal1 are all fine and accessible. meth0 is not. I've double-checked that all the config files are the same. I've tried launching the meth0 server instance first after a reboot, but meth0 still refuses connections. All firewalls are off, and the logs on meth show nothing. The logs on pot are filled with "Warning: msgpair failed to tcp://meth0:3334, will retry: Connection refused".
My ultimate goal is to be able to use the full 4 Gbps of bandwidth connected to the file servers. Any ideas?
Thanks for your time!
-James
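For anyone who wants to reproduce the per-port traffic check described above, here is a minimal Python sketch of one way to compare byte counters from /proc/net/dev before and after a large transfer. It is not part of PVFS; the function names (parse_net_dev, traffic_delta, measure) are made up for illustration, and it assumes the usual Linux layout in which each interface line has 16 counters after the colon, with received bytes first and transmitted bytes ninth.

```python
import time


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) < 9:
            continue  # not a counter line in the assumed layout
        # Assumed layout: field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def traffic_delta(before, after):
    """Return {interface: (rx_delta, tx_delta)} for interfaces in both samples."""
    return {
        iface: (after[iface][0] - before[iface][0],
                after[iface][1] - before[iface][1])
        for iface in after
        if iface in before
    }


def measure(interval=10.0, path="/proc/net/dev"):
    """Sample the counters twice, `interval` seconds apart, and return the deltas."""
    with open(path) as f:
        before = parse_net_dev(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_net_dev(f.read())
    return traffic_delta(before, after)
```

Calling measure() on meth or crystal while a large file is being written gives the bytes moved per interface during the interval. If only one of the two gigabit interfaces (for example eth0 and eth1, which are assumed names here) shows a large delta, the traffic is going over a single port.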
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
