If you have control over the reproxy, you can do a simple hash against the list of all machines you have. Varnish/squid can also do internal forwards after hashing.
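
A minimal sketch of that simple-hash idea, assuming the file path is the cache key. The node names and the function `pick_cache_node` are made up for illustration and are not from any particular proxy:

```python
import hashlib


def pick_cache_node(path: str, nodes: list[str]) -> str:
    """Map a file path to one node so a given file is always cached in one place."""
    if not nodes:
        raise ValueError("need at least one cache node")
    # Use a stable hash (not Python's per-process randomized hash()) so every
    # frontend picks the same node for the same path.
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical node list; replace with your own cache hosts.
NODES = ["cache1:6081", "cache2:6081", "cache3:6081"]

if __name__ == "__main__":
    print(pick_cache_node("/music/track-123.mp3", NODES))
```

Note that plain modulo hashing remaps most files whenever the node list changes; a consistent-hashing scheme reduces that churn if nodes come and go often.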
The internal forwarding is a little weird, but varnish affords you a lot of smarts on the serving end.

Another thing worth noting, I guess, is that for mogilefs (getting even more off topic...) files are usually referred from one server at a time. So once a file gets into the filesystem cache on that server, sendfile(2) from lighttpd or nginx or whatever is very fast. If you're rounding through a bunch of servers, or don't have enough RAM on your storage nodes, that isn't as helpful.

On Mon, 2 Nov 2009, Jay Paroline wrote:

>
> Adam: yes, we serve up the file contents, not the URL to the media.
> lighttpd makes this simple with the X-SendFile header.
>
> Having not used varnish or squid before, do either support some form
> of distributed memory caches so that rather than buying a single
> expensive box with tons of memory, we can aggregate the free memory of
> a bunch of less expensive boxes, as with memcached?
>
> Jay
>
> dormando wrote:
> > You could put something like varnish in between that final step and your
> > client..
> >
> > so key is pulled in, file is looked up, then file is fetched *through*
> > varnish. Of course I don't know offhand how much work it would be to make
> > your app deal with that fetch-through scenario.
> >
> > Since these files are large memcached probably isn't the best bet for
> > this.
> >
> > On Mon, 2 Nov 2009, Jay Paroline wrote:
> >
> > >
> > > I'm not sure how well a reverse proxy would fit our needs, having
> > > never used one before. The way we do streaming is a client sends a
> > > one-time-use key to the stream server. The key is used to determine which
> > > file should be streamed, and then the file is returned. The effect is
> > > that no two requests are identical, and that code must be run for
> > > every single request to verify the request and look up the appropriate
> > > file. Is it possible or practical to use a reverse proxy in that way?
> > >
> > > Jay
> > >
> > > Adam Lee wrote:
> > > > I'm guessing you might get better mileage out of using something written
> > > > more for this purpose, e.g. squid set up as a reverse proxy.
> > > >
> > > > On Mon, Nov 2, 2009 at 4:35 PM, Jay Paroline <[email protected]>
> > > > wrote:
> > > >
> > > > >
> > > > > I'm running this by you guys to make sure we're not trying something
> > > > > completely insane. ;)
> > > > >
> > > > > We already rely on memcached quite heavily to minimize load on our DB
> > > > > with stunning success, but as a music streaming service, we also serve
> > > > > up lots and lots of 5-6MB files, and right now we don't have a
> > > > > distributed cache of any kind, just lots and lots of really fast
> > > > > disks. Due to the nature of our content, we have some files that are
> > > > > insanely popular, and a lot of long tail content that gets played
> > > > > infrequently. I don't remember the exact numbers, but I'd guesstimate
> > > > > that the top 50GB of our many TB of files accounts for 40-60% of our
> > > > > streams on any given day.
> > > > >
> > > > > What I'd love to do is get those popular files served from memory,
> > > > > which should alleviate load on the disks considerably. Obviously the
> > > > > file system cache does some of this already, but since it's not
> > > > > distributed it uses the space a lot less efficiently than a
> > > > > distributed cache would (say one popular file lives on 3 stream nodes,
> > > > > it's going to be cached in memory 3 separate times instead of just
> > > > > once). We have multiple stream servers, obviously, and between them
> > > > > we could probably scrounge up 50GB or more for memcached,
> > > > > theoretically removing the disk load for all of the most popular
> > > > > content.
> > > > >
> > > > > My favorite memory cache is of course memcache, so I'm wondering if
> > > > > this would be an appropriate use (with the slab size turned way up,
> > > > > obviously). We're going to start doing some experiments with it, but
> > > > > I'm wondering what the community thinks.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jay
> > > > >
> > > >
> > > > --
> > > > awl
> > > >
