Curious if anyone has gone down the route of creating a FUSE filesystem implementation around Memcache, kind of along the lines of CacheFS.

The problem with CacheFS is that you're limited to the amount of memory on the given machine, plus the usual problems of cache consistency. MemcacheFS would extend this principle to work on exported/distributed filesystems and benefit from the distributed, fault-tolerant design of Memcache. Essentially, it would act as a layer in front of a networked file system, storing small objects on demand (on a cache miss) and keeping stat caches in memory. Readdir() operations would fall back to disk.
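To make the read path concrete, here is a minimal Python sketch of the read-through behaviour described above. It is only an illustration under stated assumptions: DictCache is a hypothetical in-process stand-in for a Memcache client (only get/set), and the names read_file, stat_file, cache_key and the MAX_OBJECT_SIZE cutoff are made up for this example. A real version would sit behind FUSE callbacks and would need an invalidation strategy, which this sketch ignores.

```python
import hashlib
import os


class DictCache:
    """Hypothetical stand-in for a Memcache client (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


# Assumed cutoff: only "small" files are cached; larger ones go to disk.
MAX_OBJECT_SIZE = 64 * 1024


def cache_key(prefix, path):
    # Memcache keys are limited to 250 bytes and may not contain
    # whitespace, so hash the path instead of using it directly.
    return prefix + ":" + hashlib.sha1(path.encode("utf-8")).hexdigest()


def read_file(cache, path):
    """Return file contents, filling the cache on a miss."""
    key = cache_key("data", path)
    data = cache.get(key)
    if data is not None:
        return data  # cache hit: no trip to the network filesystem
    with open(path, "rb") as f:  # cache miss: fall back to the filesystem
        data = f.read()
    if len(data) <= MAX_OBJECT_SIZE:
        cache.set(key, data)
    return data


def stat_file(cache, path):
    """Return (mode, size, mtime), caching the result of os.stat()."""
    key = cache_key("stat", path)
    st = cache.get(key)
    if st is None:
        s = os.stat(path)
        st = (s.st_mode, s.st_size, s.st_mtime)
        cache.set(key, st)
    return st
```

Note that once a file is cached, later reads never touch the underlying filesystem, which is exactly where the cache consistency question comes in.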

Our specific application deals with millions of small XML files, 5-10 KB each. We don't care about readdir operations very much. Often the files are accessed directly from within an XSLT processor that is not Memcache-aware, not to mention numerous other applications which cannot easily be modified to use Memcache natively. Furthermore, scaling Memcache is a lot easier than scaling many distributed filesystems; you just fire up another process and add it to the pool. Memcache has been ideally suited for enhancing the performance of database-driven applications, so why not apply it to another kind of database, the filesystem?

I am curious what everyone's take is on this idea. Does this sound like a practical solution? What would be your critique of it, or other concerns, permissions and security aside? Keep in mind, the underlying filesystem is already network-based, so this would not significantly increase network IO. Clearly, I'm tailoring this to our specific needs of accessing small files. The solution would be dramatically more complex for storing large files and would probably require striping of some sort.
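For the large-file case, one possible form of striping is sketched below in Python. This is an assumption about how it might be done, not a worked-out design: the key layout ("name:chunk:N" plus a "name:count" manifest), the CHUNK_SIZE value, and the DictCache stand-in client are all hypothetical. The only hard constraint reflected here is that memcached's default item size limit is 1 MB, so each stripe has to fit under that.

```python
# Assumed stripe size, kept below memcached's default 1 MB item limit.
CHUNK_SIZE = 512 * 1024


class DictCache:
    """Hypothetical stand-in for a Memcache client (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def put_striped(cache, name, data, chunk_size=CHUNK_SIZE):
    """Store data as fixed-size chunks plus a chunk-count manifest."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for i, chunk in enumerate(chunks):
        cache.set("%s:chunk:%d" % (name, i), chunk)
    # Write the manifest last so a half-written object never looks complete.
    cache.set("%s:count" % name, len(chunks))


def get_striped(cache, name):
    """Reassemble the object, or return None if any piece is missing."""
    count = cache.get("%s:count" % name)
    if count is None:
        return None
    parts = []
    for i in range(count):
        chunk = cache.get("%s:chunk:%d" % (name, i))
        if chunk is None:
            # A single evicted chunk makes the whole object a cache miss.
            return None
        parts.append(chunk)
    return b"".join(parts)
```

The extra complexity shows up immediately: any one evicted chunk turns the whole file into a miss, and a read costs one round trip per chunk unless the client batches its gets.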


Best,

Erik Osterman
