On Oct 6, 2006, at 5:59 PM, Pete Wyckoff wrote:
[EMAIL PROTECTED] wrote on Fri, 06 Oct 2006 16:33 -0500:
On Oct 6, 2006, at 1:48 PM, Julian Martin Kunkel wrote:
Also it will not
allow setting the servers for all distributions...
Yeah I can't imagine wanting to ever do that. It would mean passing
in a distribution different from the default simple-stripe, as well
as a hint saying you want a specific set of servers in the same
call. Seems sort of yucky to me. I'd rather have all the
information about the distribution in the distribution. You're even
able to use the distribution field in the directory hints structure
to specify per-directory IO server lists. Not that you would ever
want to do that either...
I agree with Sam that this is yucky. I'm hijacking this thread.
Let's forget about hints for a moment
I think everyone is already aware of this, but just to clarify terms,
I mentioned the 'directory hints' previously, but these aren't the
same hints that Julian has added to his hints branch. Murali added
these extended attributes:
user.pvfs2.dist_name
user.pvfs2.dist_params
user.pvfs2.num_dfiles
user.pvfs2.meta_hint
These are used to modify the behavior of files created in that
directory. In terms of functionality, if we were to provide a
distribution that allows us to enumerate the actual servers, or just
specify a number of servers (and let them be chosen randomly), then
the num_dfiles eattr becomes redundant. This is a case though where
one might want to be able to store the IO servers list in the
distribution (or again just a count of them).
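For anyone who has not used the directory eattrs listed above, here is a rough sketch of setting them from Python on a Linux client. The attribute names are the ones from this message; the helper names, the example values ("simple_stripe", "strip_size:65536") and the mount path are assumptions for illustration only, not something confirmed in this thread.

```python
import os


def build_dir_hints(dist_name, dist_params=None, num_dfiles=None):
    """Build the eattr name -> value map for a directory (hypothetical helper).

    Only the attribute names come from the list above; the value formats
    are assumed here.
    """
    hints = {"user.pvfs2.dist_name": dist_name.encode()}
    if dist_params is not None:
        hints["user.pvfs2.dist_params"] = dist_params.encode()
    if num_dfiles is not None:
        hints["user.pvfs2.num_dfiles"] = str(num_dfiles).encode()
    return hints


def apply_dir_hints(directory, hints):
    """Write each hint as an extended attribute on the directory (Linux only)."""
    for name, value in hints.items():
        os.setxattr(directory, name, value)


# Example (path is hypothetical):
# apply_dir_hints("/mnt/pvfs2/mydir",
#                 build_dir_hints("simple_stripe", "strip_size:65536", 4))
```

If a distribution could carry the server list or count itself, the num_dfiles entry in a map like this is the part that would go away.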
and decide how we want to
extend the concept of distributions, as seen by users, in such a way
that they can specify particular IO servers by name. If this is an
interface people want, we should design it properly, not just
implement it with hints because we (might) have them.
Some issues, please suggest approaches and other issues. (I'm using
"name" here to mean host alias.)
1. What kind of control do users want?
- all data on one server by name?
- arbitrary control of stripe sizes and host names?
2. New distribution name, or extension to existing ones?
- dist-varstrip has a lot of flexibility, but no hostnames
- maybe a new "dist-single-host-by-name" is all that is desired
3. Store hostnames in on-disk distribution?
- guessing no for the single-stripe distro, but perhaps somebody
can really think of a use case for this?
4. User API
- through PVFS_dist_create
- (please not through both PVFS_dist_create + some hint)
- via environment variable too?
I would argue that environment variables are messy for this. We
already have a precedent for using extended attributes to group the
way new files get distributed, and I would imagine eattrs give you
some interesting capabilities when it comes to doing migration.
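To make questions 1 and 2 in the list above a bit more concrete, here is a toy model of what a distribution carrying host aliases might look like. This is not PVFS2 code or an existing distribution: the class name, fields and method are all hypothetical, and it only shows round-robin striping over a user-supplied list of names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NamedStripeDist:
    """Hypothetical distribution: fixed-size strips, round-robin over named hosts."""

    hosts: List[str]   # host aliases, in stripe order
    strip_size: int    # bytes per strip

    def __post_init__(self):
        if not self.hosts:
            raise ValueError("need at least one host alias")
        if self.strip_size <= 0:
            raise ValueError("strip_size must be positive")

    def host_for_offset(self, offset: int) -> str:
        """Return the host alias that would hold the byte at this logical offset."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        strip_index = offset // self.strip_size
        return self.hosts[strip_index % len(self.hosts)]
```

With a single alias in the list, every offset maps to that one host, which is the "all data on one server by name" case; a longer list gives the striped case. Per-host strip sizes, as in dist-varstrip, would need more than this.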
If our design happens to end up as something that would be
implemented well by hints, then we can think about using them. For
now, let's just get the design correct.
We can come back and argue the merits of a generic hint interface in
a different thread.
-- Pete
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers