[EMAIL PROTECTED] wrote on Mon, 21 Aug 2006 13:34 -0500:
> FYI I was finally cleared to upgrade my cluster to RHEL4 (2.6 kernel). 
> Unfortunately this doesn't look like it fixed my problem.  Any 
> operation on a pvfs2 filesystem over native infiniband (i.e. not tcp or 
> IPoIB) is extremely slow.  Just a simple "ls" on a pvfs2 filesystem 
> with a handful of files and directories takes 5-10 seconds and the 
> pvfs2-server process takes up 98% of the CPU.
> 
> Because of the operational demands of the users on this cluster I can't 
> change the filesystems back from tcp to ib and get you some debug info 
> right this moment.  I'm hoping I can set up some playspace where I can 
> give you some more details later this week.

That's sad to hear.  There's code now in both the VAPI and OpenIB
versions of pvfs/IB that spins for the first 10 ms waiting for a
message, then blocks using poll() until the NIC raises an interrupt
to signal that something has arrived.
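
To make the spin-then-block idea concrete, here is a minimal sketch in
C.  It is not the code from ib.c: the helper name wait_for_message(),
its return values, and the use of a plain file descriptor as a
stand-in for the NIC's completion event source are all assumptions
made for illustration.  Only the shape (busy-poll for a short window,
then sleep in poll()) comes from the description above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in milliseconds. */
static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Hypothetical helper: wait until fd becomes readable.
 * Returns 1 if the event was seen while spinning, 2 if it was seen
 * after blocking, 0 if the blocking phase timed out, -1 on error.
 * The fd is an assumed stand-in for the NIC's event channel.
 */
int wait_for_message(int fd, int spin_ms, int block_timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    long long deadline;
    int rc;

    assert(spin_ms >= 0);
    deadline = now_ms() + spin_ms;

    /* Phase 1: spin.  A zero timeout makes poll() return at once,
     * so this loop burns CPU but reacts with minimal latency. */
    do {
        rc = poll(&pfd, 1, 0);
        if (rc != 0)
            return rc > 0 ? 1 : -1;
    } while (now_ms() < deadline);

    /* Phase 2: block.  The process sleeps in the kernel until the
     * descriptor is readable or the timeout expires. */
    rc = poll(&pfd, 1, block_timeout_ms);
    if (rc > 0)
        return 2;
    return rc;  /* 0 on timeout, -1 on error */
}
```

A caller would use something like wait_for_message(fd, 10, -1) to
spin for 10 ms and then block indefinitely.  If a server sits near
100% CPU while idle, one thing such a sketch suggests checking is
whether the blocking phase is ever reached.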

I'll be happy to look at debug traces again if you get a chance.
The sched_yield() that seemed to help your machine with the 2.4.21
kernel is commented out in src/io/bmi/bmi_ib/ib.c, because it is the
wrong thing to do, but we can re-enable it to diagnose the problem again.

                -- Pete
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
