On Mon, May 14, 2012 at 12:44:48PM -0400, Becky Ligon wrote:
> Andrew:
> 
> You have given us a lot to chew on, but it sounds like the kernel module is
> having problems.  I'm not familiar with parallel make.  Does it use MPI?

No MPI.  He's referring to the 'make' feature where you can spawn many
processes to work on the dependency tree.  So 'make -j10' says
"whenever make finds independent targets, spawn up to 10 jobs to
work on them."
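
For reference, Andrew's reproduction recipe below boils down to
something like this (the tarball name is just a placeholder; any
reasonably large source tree on the pvfs2 mount should do):

    tar xf some-large-project.tar.gz
    cd some-large-project
    make -j10

Higher -j values make the failures more frequent, per his tests.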

==rob

> Becky
> 
> On Sat, May 12, 2012 at 9:17 PM, Andrew Savchenko <[email protected]> wrote:
> 
> > Hello,
> >
> > During some testing I found that orangefs behaves badly when intense
> > parallel I/O is performed on the same directory. For testing I used
> > parallel make: just untar some relatively large tarball and run
> > make -j10
> > I used torque-3.0.5, but this should not matter.
> >
> > My current setup is: orangefs-2.8.5, 15 servers serving both data and
> > metadata, and 16 clients, 15 of which are on the same nodes as the
> > servers; this testing was conducted on a separate node with no servers
> > on it. The kernel is linux-3.2.14, and ACL support is disabled due to
> > previously found bugs:
> >
> > http://www.beowulf-underground.org/pipermail/pvfs2-developers/2012-April/004974.html
> > I run with TroveSync disabled.
> >
> > During a parallel make, random files (and rarely directories) become
> > inaccessible: any attempt to use them results in EIO (errno 5,
> > input/output error). However, these files can be accessed normally
> > from other nodes, or even from the same node using pvfs2-cp, which to
> > my knowledge doesn't go through the kernel VFS.
> >
> > I made a series of tests to find what may affect this behaviour and
> > found that:
> >
> > 1) The error rate depends on the parallelism level: make -j2 is often
> > fine, -j5 produces more problems, -j10 tends to "generate" broken
> > files very often, and so on.
> >
> > 2) With client-side caching disabled (the defaults are -a5 -n5):
> > pvfs2-client -a 0 -n 0 ...
> > things became worse: the frequency of errors rose significantly. A
> > somewhat larger cache (-a10 -n10) seems to work better, but doesn't
> > eliminate the problem completely.
> >
> > 3) During such tests I found that the kernel sometimes produces
> > backtraces and complains about a NULL pointer dereference. See the
> > attached kernel.log for details. pvfs2-client also complains a lot in
> > its log, repeatedly with the same message:
> > [E 09:49:22.580278] Completed upcall of unknown type ff00000d!
> > though it is not strictly in sync with the kernel backtraces.
> >
> > 4) When I tried to increase the client cache significantly (-a500
> > -n500) and ran make -j10, I got a kernel crash: the whole disk
> > subsystem (not only pvfs2) became unresponsive and only the hardware
> > watchdog saved the situation. This was a general protection fault. I
> > managed to save the kernel trace, see kernel.crash.log.
> >
> > 5) There are no errors logged on the pvfs2 servers.
> >
> > 6) Setting TroveSyncMeta to yes has no noticeable effect on this issue.
> >
> > 7) Setting TroveSyncData to yes makes things somewhat better in some
> > cases and worse in others.
> >
> > 8) I tried to increase the AttrCacheSize and AttrCacheMaxNumElems
> > values, though with no effect. Nevertheless I plan to keep the larger
> > values; they shouldn't hurt, and we have plenty of RAM available.
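
For anyone wanting to try the same tuning: as far as I know those
attribute-cache knobs live in the server's fs.conf, in the
<StorageHints> section of the filesystem stanza. A rough sketch with
purely illustrative values (the placement and the numbers are my
assumptions, not a recommendation):

    <StorageHints>
        TroveSyncMeta yes
        TroveSyncData no
        TroveMethod alt-aio
        AttrCacheSize 1023
        AttrCacheMaxNumElems 4096
    </StorageHints>

The servers presumably need a restart to pick up new values.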
> >
> > My current pvfs2 config is attached for reference.
> >
> > For now I can somewhat mitigate this issue with a cron script that
> > either runs mount -o remount on the affected nodes (though remounting
> > under live applications may cause problems of its own) or uses the
> > following sequence:
> > pvfs2-cp badfile tempfile
> > pvfs2-rm badfile
> > cp tempfile badfile
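
A rough sketch of that per-file recovery wrapped in a script, in case
it helps anyone; the temp-file handling and error checking here are
only illustrative:

    #!/bin/sh
    # Recover one file that returns EIO through the kernel VFS:
    # copy it out with pvfs2-cp (which bypasses the VFS), remove the
    # broken copy, then write it back through the normal mount.
    badfile="$1"
    tempfile=$(mktemp) || exit 1
    pvfs2-cp "$badfile" "$tempfile" \
        && pvfs2-rm "$badfile" \
        && cp "$tempfile" "$badfile"
    rm -f "$tempfile"

As Andrew notes below, this does nothing for applications that have
already hit the EIO.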
> >
> > But this still will not help applications that have already been
> > confused...
> >
> > I'm aware that support for 3.1 and 3.2 kernels is still experimental,
> > but I can't downgrade this system because other applications require
> > some new kernel features.
> >
> > I also found some interesting options, DBCacheSizeBytes and
> > DBCacheType, though as far as I understand they only take effect with
> > TroveMethod dbpf and are useless for the alt-aio method used in my
> > setup. Please correct me if I'm wrong.
> >
> > Best regards,
> > Andrew Savchenko
> >


-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
