Eric Ulmer <[email protected]> writes:

> I Also enabled more verbose logging and got this output in 
> /tmp/pvfs2-client.log
> (/usr/local/bin/pvfs2-client --gossip-mask=verbose -p 
> /usr/local/bin/pvfs2-client-core)
>
> [D 16:11:17.141796] Got an fs mount request for host:
>   ib://esekilx4055481:3335/pvfs2-fs
> [D 16:11:17.141803] Using Mount Point <DYNAMIC-1>
> [D 16:11:17.141808] Got Configuration Server: ib://esekilx4055481:3335 
> (len=24)
> [D 16:11:17.141813] Got FS Name: pvfs2-fs (len=8)
> [D 16:11:17.141819] BMI_addr_lookup: ib://esekilx4055481:3335
> [D 16:11:17.141824]     addr not found, go to methods
> [E 16:11:17.141829] PVFS_isys_fs_add: Failed to initialize any appropriate 
> BMI methods for addresses:
> [E 16:11:17.141834]     ib://esekilx4055481:3335
> [E 16:11:17.141841] Posting fs_add failed: Protocol not available

I think this may be the problem I was about to post about.  I can
reproduce that sort of failure with the default locked memory limit in
RH6, which presumably defeats openib memory pinning.  You need to raise
the limit for non-root users in limits.conf; 2048 seems to be high
enough.
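
For anyone wanting to check a session before starting the client, here
is a rough sketch of a test against that limit.  It assumes bash, where
"ulimit -l" prints the locked memory limit in KB (or "unlimited").  The
function name check_memlock and the variable required_kb are made up for
illustration, and the 2048 threshold is only the value that worked for
me, not a documented PVFS2 requirement.

```shell
#!/bin/bash
# Hypothetical helper: compare a locked-memory limit against a minimum.
# Assumption: 2048 KB is enough, as observed on one RH6 system.
required_kb=2048

check_memlock() {
    # $1: limit as printed by `ulimit -l` (KB, or the word "unlimited")
    # $2: required minimum in KB
    if [ "$1" = "unlimited" ]; then
        echo "ok"
        return 0
    fi
    if [ "$1" -ge "$2" ]; then
        echo "ok"
        return 0
    fi
    echo "too low"
    return 1
}

# Check the current shell's soft limit and say what to do about it.
current=$(ulimit -l)
if check_memlock "$current" "$required_kb" >/dev/null; then
    echo "locked memory limit ($current) looks sufficient"
else
    echo "locked memory limit ($current KB) is below $required_kb KB"
    # A limits.conf entry of the usual form might look like:
    #   *  soft  memlock  2048
    #   *  hard  memlock  2048
    echo "raise the memlock item in limits.conf and log in again"
fi
```

The limits.conf change only takes effect for new login sessions, so the
check needs re-running after logging in again.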

I was going to ask if it's possible to make the failure more graceful.
The sort of error above is rather confusing, and there are other
effects: for instance, when openmpi is built with pvfs2 support,
ompi_info just hangs like this:

  $ ompi_info --all
  ompi_info: : Unknown error 18446744071712104656
  ^C

although I don't know whether that's really an openmpi/romio problem.
Obviously jobs on the cluster are set up suitably, but there's no
particular reason for sessions on the login node to have the raised
limit normally.

Actually, it's even more confusing in that if I take ulimit -l up to
1024, instead of an error log as above, I get a misleading failure:

  $ pvfs2-ping -m /pvfs2
  
  (1) Parsing tab file...
  
  (2) Initializing system interface...
  
  (3) Initializing each file system found in tab file: /etc/fstab...
  
     PVFS2 servers: ib://nfs-server-ib:3335
     Storage name: pvfs2-fs
     Local mount point: /pvfs
     /pvfs: Ok
  
  (4) Searching for /pvfs2 in pvfstab...
  Failure: could not find filesystem for /pvfs2 in pvfs2tab /etc/fstab
  Entry 0: /pvfs

or a hang:

  $ pvfs2-ls
  [E 22:44:18.914893] Warning: openib_mem_register: ibv_register_mr.
  ^C

It's probably worth documenting this unless I missed it.

_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
