Adam: yes, we serve up the file contents, not the URL to the media. lighttpd makes this simple with the X-Sendfile header.
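For anyone following along, here is a minimal sketch of that flow. It assumes an in-memory dict as the key store and a made-up file path; `ONE_TIME_KEYS` and `handle_stream_request` are hypothetical names for illustration, not our actual code. It also assumes the web server is configured to honor the X-Sendfile header coming back from the application.

```python
# Hypothetical store mapping one-time keys to file paths. A real stream
# server would keep this somewhere shared; a dict is an assumption here.
ONE_TIME_KEYS = {"abc123": "/media/track-0001.mp3"}


def handle_stream_request(key, keys=ONE_TIME_KEYS):
    """Return (status, headers) for a stream request carrying a one-time key."""
    # pop() validates and consumes the key in one step, so the same key
    # cannot be replayed on a second request.
    path = keys.pop(key, None)
    if path is None:
        return 403, {}
    # The application only emits a header. The web server is expected to
    # stream the file body itself when it sees X-Sendfile.
    return 200, {"X-Sendfile": path, "Content-Type": "audio/mpeg"}
```

Because the key is consumed on first use, no two requests look alike to anything sitting in front of the application, which is why a cache keyed on the request URL alone does not help.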
Having not used varnish or squid before, does either support some form of
distributed memory cache, so that rather than buying a single expensive box
with tons of memory, we can aggregate the free memory of a bunch of less
expensive boxes, as with memcached?

Jay

dormando wrote:
> You could put something like varnish in between that final step and your
> client..
>
> so key is pulled in, file is looked up, then file is fetched *through*
> varnish. Of course I don't know offhand how much work it would be to make
> your app deal with that fetch-through scenario.
>
> Since these files are large memcached probably isn't the best bet for
> this.
>
> On Mon, 2 Nov 2009, Jay Paroline wrote:
>
> >
> > I'm not sure how well a reverse proxy would fit our needs, having
> > never used one before. The way we do streaming is a client sends a
> > one-time-use key to the stream server. The key is used to determine which
> > file should be streamed, and then the file is returned. The effect is
> > that no two requests are identical, and that code must be run for
> > every single request to verify the request and look up the appropriate
> > file. Is it possible or practical to use a reverse proxy in that way?
> >
> > Jay
> >
> > Adam Lee wrote:
> > > I'm guessing you might get better mileage out of using something written
> > > more for this purpose, e.g. squid set up as a reverse proxy.
> > >
> > > On Mon, Nov 2, 2009 at 4:35 PM, Jay Paroline <[email protected]> wrote:
> > >
> > > >
> > > > I'm running this by you guys to make sure we're not trying something
> > > > completely insane. ;)
> > > >
> > > > We already rely on memcached quite heavily to minimize load on our DB
> > > > with stunning success, but as a music streaming service, we also serve
> > > > up lots and lots of 5-6MB files, and right now we don't have a
> > > > distributed cache of any kind, just lots and lots of really fast
> > > > disks. Due to the nature of our content, we have some files that are
> > > > insanely popular, and a lot of long tail content that gets played
> > > > infrequently. I don't remember the exact numbers, but I'd guesstimate
> > > > that the top 50GB of our many TB of files accounts for 40-60% of our
> > > > streams on any given day.
> > > >
> > > > What I'd love to do is get those popular files served from memory,
> > > > which should alleviate load on the disks considerably. Obviously the
> > > > file system cache does some of this already, but since it's not
> > > > distributed it uses the space a lot less efficiently than a
> > > > distributed cache would (say one popular file lives on 3 stream nodes,
> > > > it's going to be cached in memory 3 separate times instead of just
> > > > once). We have multiple stream servers, obviously, and between them
> > > > we could probably scrounge up 50GB or more for memcached,
> > > > theoretically removing the disk load for all of the most popular
> > > > content.
> > > >
> > > > My favorite memory cache is of course memcache, so I'm wondering if
> > > > this would be an appropriate use (with the slab size turned way up,
> > > > obviously). We're going to start doing some experiments with it, but
> > > > I'm wondering what the community thinks.
> > > >
> > > > Thanks,
> > > >
> > > > Jay
> > > >
> > >
> > > --
> > > awl
> >
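
On the memcached idea in the original post, a rough sketch of what the experiment could look like. Two things in it are assumptions and not anything from the thread: instead of raising the item size limit, it splits each file into chunks that fit under memcached's default 1 MB limit, and it uses a plain dict (`cache`) as a stand-in for a memcached client so the logic can be tried without a server. `pick_node`, `store_file`, and `load_file` are hypothetical names.

```python
import hashlib

# Stay a little under memcached's default 1 MB item limit to leave room
# for the key and item overhead.
CHUNK_SIZE = 1024 * 1024 - 1024


def pick_node(file_id, nodes):
    """Map a file to exactly one node, so it is cached once across the pool."""
    digest = hashlib.md5(file_id.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def store_file(cache, file_id, data, chunk_size=CHUNK_SIZE):
    """Split data into chunks and store each one under its own key."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for n, chunk in enumerate(chunks):
        cache[f"{file_id}:{n}"] = chunk
    # Written last, so a reader never sees a count without its chunks.
    cache[f"{file_id}:count"] = len(chunks)


def load_file(cache, file_id):
    """Reassemble a file, or return None if any piece is missing."""
    count = cache.get(f"{file_id}:count")
    if count is None:
        return None
    parts = []
    for n in range(count):
        chunk = cache.get(f"{file_id}:{n}")
        if chunk is None:
            # One evicted chunk makes the whole file a miss; fall back to disk.
            return None
        parts.append(chunk)
    return b"".join(parts)
```

The modulo in `pick_node` is the simplest possible mapping; it reshuffles most files whenever a node is added or removed, which is why memcached clients commonly offer consistent hashing instead. The chunked layout also means a 5-6MB file costs several round trips per read, so it would be worth measuring against simply letting the file system cache do the work.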
