Mark Smith wrote:
I want to use MogileFS to serve files to remote clients via HTTP. Ideally
the files should be accessed directly from the storage nodes by the
clients. MogileFS seems good for this since all file access is via HTTP,
and it works nicely with robust HTTP servers (like lighty).
Problem is that I don't want the remote clients to see the actual
MogileFS file path when accessing the files, and I want some security so
that not just any client can access any file. So instead of providing the
client with something like "http://myserver.com/dev1/0/000/000/000000001.fid",
I want the client to access some name I generate (for example with
lighty's mod_secdownload).
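For reference, the mod_secdownload scheme I have in mind protects a URL with an
MD5 token over a shared secret, the relative path, and a hex timestamp. A rough
Python sketch of generating such a URL (I'm assuming the classic MD5 variant;
the secret and prefix here are placeholders that would have to match
secdownload.secret and secdownload.uri-prefix in the lighttpd config):

```python
import hashlib
import time

def secdownload_url(secret, rel_path, prefix="/dl/", now=None):
    """Build a mod_secdownload-style protected URL.

    rel_path must start with "/"; "secret" and "prefix" are hypothetical
    values that must match the lighttpd configuration.
    """
    timestamp_hex = "%08x" % int(now if now is not None else time.time())
    # Token = md5(secret + relative path + hex timestamp), hex-encoded.
    token = hashlib.md5((secret + rel_path + timestamp_hex).encode()).hexdigest()
    return "%s%s/%s%s" % (prefix, token, timestamp_hex, rel_path)
```

The resulting URL carries everything the server needs to re-check the token, so
no state has to be shared beyond the secret.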
The question is how I would go about configuring the storage nodes to do
this. Is there any such built-in functionality in MogileFS?
That's not how MogileFS is traditionally used in a web environment.
The idea is that you run your MogileFS network internally and then
expose to the user something else that uses the MogileFS trackers to
translate your-names to internal-names. The storage nodes are then
never exposed to the end users.
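To make the your-names-to-internal-names step concrete: the tracker speaks a
simple line-based text protocol on its TCP port, and a path lookup is one
request/response exchange. A rough Python sketch of building and parsing a
get_paths exchange (command and argument names as I remember them, so verify
against the MogileFS client sources; the connect helper is hypothetical):

```python
import socket
from urllib.parse import urlencode, parse_qs

def build_get_paths(domain, key):
    # One tracker command: verb, space, URL-encoded args, CRLF.
    return ("get_paths " + urlencode({"domain": domain, "key": key}) + "\r\n").encode()

def parse_tracker_reply(raw):
    # Replies look like "OK paths=2&path1=...&path2=..." or "ERR <code> <msg>".
    line = raw.decode().strip()
    if not line.startswith("OK "):
        raise RuntimeError("tracker error: " + line)
    args = {k: v[0] for k, v in parse_qs(line[3:]).items()}
    return [args["path%d" % i] for i in range(1, int(args.get("paths", 0)) + 1)]

def get_paths(tracker_addr, domain, key):
    # Hypothetical helper: ask one tracker which storage-node URLs hold a key.
    with socket.create_connection(tracker_addr) as s:
        s.sendall(build_get_paths(domain, key))
        return parse_tracker_reply(s.makefile("rb").readline())
```

Your front end maps its public names to (domain, key) pairs however it likes;
the paths that come back are the internal storage-node URLs.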
This setup gives you the advantage of controlling path caching in
your application (you know best when certain items should expire, if
ever), as well as the flexibility to properly handle fallback when a
storage node is unavailable. It's also safer: you don't have to plan
for your storage nodes to have separate upload and download
processes.
Anyway ... apparently we don't have "best practices" setup information
on the site or wiki... that's lame. Well, typically MogileFS is
combined with Perlbal, as the latter does most of the heavy lifting for
using MogileFS. You still need your application servers to do a path
lookup and decide on caching policies (if you want to enable path
caching), but you don't need to do any file serving there; Perlbal
will do it from the storage nodes.
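For the record, that Perlbal setup usually takes this shape: a reverse_proxy
service with reproxying enabled, where the app server answers the path lookup
with an X-REPROXY-URL header pointing at the storage node instead of sending
the file body itself. A sketch of the config (host/port values are
placeholders):

```
CREATE POOL webapp
  POOL webapp ADD 10.0.0.1:8080

CREATE SERVICE balancer
  SET listen         = 0.0.0.0:80
  SET role           = reverse_proxy
  SET pool           = webapp
  SET enable_reproxy = on
ENABLE balancer
```

With enable_reproxy on, Perlbal sees the X-REPROXY-URL header in the app's
response and streams the file from the storage node to the client, so the app
server never touches the file data.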
I have a strong feeling that this is just going to be more confusing
than it was helpful, and I know our documentation is horrendous for
beginners, so please reply (to the list!) with questions and we'll get
you going. :)
The problem I'm trying to avoid by serving the files directly from the
storage nodes is overloading some dedicated machines with the work of
"proxying" data between the storage node and the end user. I also don't
want the extra traffic on my local network. Path caching can still be
done by whoever provides the clients with the URLs to the files. I can
easily avoid hitting the trackers too much by caching these paths, but
at the point when a client wants to download a file, accessing the
storage node directly seems like the most efficient solution to me.
One option is installing a second HTTP server on each storage node; it
would do whatever translation or security I want and then serve the
local files upon request from the remote client. The remote client would
know which storage node to go to by talking to some web app that uses
the trackers to find which storage nodes have the requested file; this
app can also do path caching if required.
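To sketch that first option: the second server on the storage node only has to
recompute the same token the URL generator used and refuse the request on
mismatch or expiry, then serve the file from the local MogileFS device path. A
hypothetical check, assuming an md5(secret + path + hex-timestamp) token like
lighttpd's mod_secdownload uses (all names made up):

```python
import hashlib
import time

def check_token(secret, path_info, max_age=60, now=None):
    """Validate a "/<md5-token>/<hex-timestamp>/<rel-path>" request path.

    Returns the relative file path to serve if the token is valid and
    fresh, or None if the request should be rejected with a 403.
    """
    try:
        _, token, timestamp_hex, rel = path_info.split("/", 3)
        issued = int(timestamp_hex, 16)
    except ValueError:
        return None
    rel_path = "/" + rel
    # Recompute the token the generator would have produced.
    expected = hashlib.md5((secret + rel_path + timestamp_hex).encode()).hexdigest()
    if token != expected:
        return None
    now = time.time() if now is None else now
    if now - issued > max_age:
        return None
    return rel_path
```

On success the server would map rel_path under the local MogileFS document
root and serve it; on None it returns 403 without revealing anything about
the path layout.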
Alternatively, I could do everything from the already existing web server
on each storage node by hacking my way through the MogileFS sources to
add some file-name security options where it configures the storage
node's web server. If this seems logical and a better solution than
running two web servers on each storage node, then I might actually do
it (assuming some support from "the experts" will be available).
Finally, I'd like to ask whether there's any preference as to which web
server to run on the storage nodes (lighttpd/apache/perlbal), and what
the original intention was behind this flexibility.