On Tue, Nov 6, 2012 at 8:49 AM, Andrew Deason <[email protected]> wrote:

> On Tue, 6 Nov 2012 00:06:53 -0800
> Timothy Balcer <[email protected]> wrote:
>
> > I have a need to think about replicating large volumes (multigigabyte)
> > of large number (many terabytes of data total), to at least two other
> > servers besides the read write volume, and to perform these releases
> > relatively frequently (much more than once a day, preferably)
>
> How much more frequently? Hourly? Some people do 4 times hourly (and
> maybe more) successfully.
>

Well, unless I am missing something seriously obvious, even hourly may be
optimistic: for example, it took 1.5 hours to rsync a subdirectory to an AFS
volume that held not a lot of content, but many directories.

How frequently depends on use; the point is to be able to release faster than
the writes come in. I don't have performance data on the writes yet, but that
will change anyway: we are going from 200+ clients to many more, which is why
I am working with AFS in the first place. The environment is a write-once,
read-many situation.

>
> > Also, these other two (or more) read-only volumes for each read write
> > volume will be remote volumes, transiting across relatively fat, but
> > less than gigabit, pipes (100+ megabits)
>
> Latency may matter more than bandwidth; do you know what it is?
>

Depending on the colo site, between 30 and 60 ms.


>
> > For the moment what I have decided to experiment with is a simple
> > system.  My initial idea is to work the afs read-only volume tree into
> > an AUFS union, with a local read write partition in the mix. This way,
> > writes will be local, but I can periodically "flush" writes to the AFS
> > tree, double check they have been written and released, and then
> > remove them from the local partition.. this should maintain integrity
> > and high availability for the up-to-the-moment recordings, given I
> > RAID the local volume. Obviously, this still introduces a single point
> > of failure... so I'd like to flush as frequently as possible.
> > Incidentally, it seems you can NFS export such a union system fairly
> > simply.
>
> I'm not sure I understand the purpose of this; are you trying to write
> new data from all of the 'remote' locations, and you need those writes
> to 'finish' quickly?
>

No, I am writing from a local audio/video server to a local repo, which needs
to be very fast in order to service live streaming in parallel with writes,
on a case-by-case basis.

That local repo would be the R/W branch layered above the AFS R/O branch, so:

 mount -t aufs -o dirs=/Read-Write=rw:/afs/path/to/read-only=ro none /union

This way I can present the /union to the application server as a read/write
repo for all its needs, including archival use, but still have AFS
underneath for replication and distribution.
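Roughly what I have in mind for the periodic flush (a dry-run sketch; the
volume name is hypothetical, the flush has to target the R/W path of the
volume rather than the read-only path used in the union, and `run` just
prints each command instead of executing it):

```shell
#!/bin/sh
# Dry-run sketch of the flush cycle: push local writes into the AFS R/W
# volume, release it, verify, then retire the local copies.
RW_BRANCH=/Read-Write
AFS_PATH=/afs/path/to/read-only     # would really be the R/W mount of the volume
VOLUME=media.archive                # hypothetical volume name

run() { echo "+ $*"; }              # dry run: print each command, don't execute

# 1. Copy new files from the local R/W branch into the AFS volume.
run rsync -a "$RW_BRANCH/" "$AFS_PATH/"

# 2. Push the new data out to the read-only replicas.
run vos release "$VOLUME"

# 3. Double-check: itemize anything that still differs (should print nothing).
run rsync -a --checksum --dry-run --itemize-changes "$RW_BRANCH/" "$AFS_PATH/"

# 4. Only then remove the flushed files from the local branch.
run find "$RW_BRANCH" -type f -delete
```

Run from cron at whatever interval the release throughput can sustain.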

*sigh*

I wish OSD was primetime :)


> > But, I feel as if I am missing something... it has become clear that
> > releasing is a pretty intensive operation, and if we're talking about
> > multiple gigabytes per release, I can imagine it being extremely
> > difficult.  Is there a schema that i can use with OpenAFS that will
> > help alleviate this problem? Or perhaps another approach I am missing
> > that may solve it better?
>
> Eh, some people do that; it just reduces the benefit of the client-side
> caching. Every time you release a volume, the server tells clients that
> for all data in that volume, the client needs to check with the server
> to see if the cached data is different from what's actually in the
> volume. But that may not matter so much, especially for a small number
> of large files.
>

Well, that's the thing: this is a large number of small-to-medium-sized files
being written continuously, plus a quite deep directory structure. I'm trying
to get it flattened out to improve scaling, but at the moment it takes 1.5
hours to rsync a subdirectory containing about 5 GB of data spread over
23,681 directories, for example.

releasing is a whole 'nuther animal... ;-)


> To improve things, you can maybe try to reduce the number of volumes
> that are changing. That is, if you are adding new data in batches, I
> don't know if it's feasible for you to add that 'batch' of data by
> creating a new volume instead of writing to existing volumes.
>

That's feasible... but what if, for example, vol1 is mounted at
/afs/foo/home/bar and contains a thousand directories, and the new content is
a thousand more directories at the exact same level of the tree? How would I
handle that? As far as I can tell, OpenAFS only allows a volume to be mounted
on its very own directory, and you can't nest them together like that.
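To make the question concrete, what I would want per batch is something like
this (server, partition, and volume names are hypothetical; `run` keeps it a
dry run):

```shell
#!/bin/sh
run() { echo "+ $*"; }   # dry run: print each command, don't execute

# vol1 is already mounted at /afs/foo/home/bar with ~1000 plain directories.
# The question: can each new batch become its own volume, mounted alongside
# those directories at the same level of the tree?
run vos create fs1.example.com /vicepa vol2.batch
run fs mkmount /afs/foo/home/bar/new-batch-0001 vol2.batch
```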

How unfeasible would it be to create N volumes, where N >= 500 per batch? I
would end up with many thousands of small volumes.. which I have no problem
with in itself, but would that be scalable? Let's assume I have spread out db
and file servers in such a way as to equalize load.


>
>
> And, of course, the release process may not be fast enough to actually
> do releases as quickly as you want. There are maybe some ways to ship
> around volume dumps yourself to get around that, and some pending
> improvements to the volserver that would help, but I would only think
> about that after you try the releases yourself.
>

The idea of doing R/W "checkpoint" volumes that I only have to release once
in a while after the first release is very appealing... if you can suggest a
solution to the problem above, I am all ears!! :) I would be VERY happy to be
able to allocate space, quota, and location on the fly, in batchwise
operations.
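The batchwise allocation I'm picturing would be something like this dry-run
sketch (server, partition, quota, and naming scheme all hypothetical; the
loop is capped at 3 here, but would be N >= 500 in practice):

```shell
#!/bin/sh
# Dry-run sketch: batch-create volumes, mount them under the tree,
# and set a quota on each, all in one pass.
SERVER=fs1.example.com   # hypothetical file server
PART=/vicepa             # hypothetical partition
TREE=/afs/foo/home/bar   # mount-point parent directory
QUOTA=5000000            # per-volume quota in KB (~5 GB)

run() { echo "+ $*"; }   # dry run: print each command, don't execute

i=1
while [ "$i" -le 3 ]; do                       # N >= 500 in practice
    vol=$(printf 'media.batch.%04d' "$i")
    run vos create "$SERVER" "$PART" "$vol"
    run fs mkmount "$TREE/$vol" "$vol"
    run fs setquota "$TREE/$vol" -max "$QUOTA"
    i=$((i + 1))
done
```

Spreading the `vos create` calls across servers/partitions would be the
load-equalization part.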


> --
> Andrew Deason
> [email protected]
>
> _______________________________________________
> OpenAFS-info mailing list
> [email protected]
> https://lists.openafs.org/mailman/listinfo/openafs-info
>



-- 
Timothy Balcer
