Thanks for the detailed reply, Kyle!

The reason I would like to set the stripe size on a per-file basis is
that in our cluster, we want each node to do some computation on its
part of the data file (the striped part), and one of the assumptions in
our system is that each striped part is a multiple of the record length
of the file. Since we have to process different data files, each with
its own record length, it would be convenient for us to specify the
stripe size as a multiple of the record length.
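
To make that constraint concrete, here is a minimal Python sketch (not
PVFS2 code). The function names and the 64 KB target are illustrative
assumptions on my part, and the round-robin mapping is a simplified
model of simple striping, not the actual PVFS2 distribution logic.

```python
def record_aligned_strip_size(record_len: int, target: int = 64 * 1024) -> int:
    """Return the multiple of record_len nearest to target (at least one record).

    Hypothetical helper: 'target' is just a preferred strip size in bytes.
    """
    if record_len <= 0:
        raise ValueError("record_len must be positive")
    # Integer rounding to the nearest whole number of records per strip.
    records_per_strip = max(1, (target + record_len // 2) // record_len)
    return records_per_strip * record_len


def server_for_record(record_index: int, record_len: int, strip_size: int,
                      num_servers: int, first_server: int = 0) -> int:
    """Index of the server holding a record under plain round-robin striping.

    Assumes strips are handed out in order starting at 'first_server';
    this is a simplification for illustration only.
    """
    if strip_size % record_len != 0:
        raise ValueError("strip_size must be a multiple of record_len")
    strip_index = (record_index * record_len) // strip_size
    return (first_server + strip_index) % num_servers


if __name__ == "__main__":
    size = record_aligned_strip_size(5000)
    print(size, server_for_record(13, 5000, size, num_servers=4))
```

For example, with 5000-byte records the first function picks a
65000-byte strip (13 records), so record 13 is the first one to land on
the second server, and no record ever straddles two servers.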

If it is absolutely certain that PVFS2 does not support striping on a
per-file basis for now, then I will need to see how we can adapt our system
to still work with PVFS in an efficient manner. Thanks again!


On Fri, Jan 23, 2009 at 3:09 PM, Kyle Schochenmaier <[email protected]> wrote:

> FSN -
>
> You are spot-on with your assessment of the striping in PVFS2.
> Each file gets a round-robin striping based on the distribution
> parameters of its directory.
> To my knowledge, you cannot set things on a per-file basis, as the
> distribution is done on a per-directory basis.
> If I'm wrong, everyone please feel free to flame me ;-)
>
> I'm not sure why you would want to base your stripe size on the size
> of the files; however, my guess is that you are trying to create a
> 'balanced load.'
> If this is the case, you can pick a generally good stripe size for the
> majority of your files, though the defaults seem to work well enough.
> Then pvfs2 will do the load balancing on its own.
> The way that pvfs2 does its striping is to have the server
> representing the 0th block of each file move in a random/round-robin
> order.
> So this spreads the file data load relatively evenly across all
> servers in the fs.
>
> The only thing you should worry about here is picking a stripe size
> that is excessively small (< 4k?), as it will bring with it a fairly
> large overhead/performance hit on the servers as well as the network.
> AND
> If you are worried about disk space, try not to set it too big.  I've
> used 4MB stripes before and not had problems if dealing with large
> files, however the 64KB default works pretty well.
> We can always try to help tune/tweak things for performance if you
> identify slowness.
>
> If you're trying to do something else with the stripe size
> modifications, feel free to ignore the previous comments.
>
> ~Kyle
>
> Kyle Schochenmaier
>
>
>
> On Fri, Jan 23, 2009 at 12:13 PM, FileSystem Novice
> <[email protected]> wrote:
> > Hi,
> >
> > While trying to explore the PVFS2 striping features using 'setfattr' and
> > distribution parameters (type of stripe, stripe size, etc.), the
> > examples I came across mostly deal with setting the striping
> > distribution parameters (simple-stripe, etc.) on a per-directory basis.
> > So if we set the distribution parameters for a directory, all files
> > created in that directory (after the parameters have been set) will be
> > striped identically.
> >
> > But let's say I want to stripe a file A in directory X using a
> > simple-stripe and a strip length of 64KB, and want to stripe a different
> > file B in the same directory X using a simple-stripe and a strip length
> > of 5KB, etc. (since each data file may have a different record length, I
> > would like to stripe them based on a multiple of the record length).
> > Would I achieve this by setting each file's extended attributes with the
> > desired values?
> > Is it possible to do striping in PVFS2 on a per-file basis as mentioned
> > above and if so, are there links to any documents / examples out there?
> >
> > Thanks a lot for your help in advance,
> > Novice
> >
> >
> > _______________________________________________
> > Pvfs2-users mailing list
> > [email protected]
> > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
>