Doug Cutting wrote:

Hairong Kuang wrote:

Another option is to create a checksum file per block at the data node where
the block is placed.

Yes, but then we'd need a separate checksum implementation for intermediate data, and for other distributed filesystems that don't already guarantee end-to-end data integrity. Also, a checksum per block would not permit checksums on randomly accessed data without re-checksumming the entire block. Finally, the checksum wouldn't be end-to-end. We really want to checksum data as close to its source as possible, then validate that checksum as close to its use as possible.
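
To make the random-access point concrete, here is a minimal sketch of checksumming in small fixed-size chunks, so a read has to verify only the chunks it touches rather than a whole block. The class name, method names and the 512-byte chunk size are all hypothetical, chosen for illustration, and are not taken from any existing code.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

final class ChunkedChecksum {
    // Hypothetical chunk size, chosen only for this sketch.
    static final int BYTES_PER_SUM = 512;

    /** Computes one CRC32 per fixed-size chunk, ideally where the data is produced. */
    static long[] checksum(byte[] data) {
        int chunks = (data.length + BYTES_PER_SUM - 1) / BYTES_PER_SUM;
        long[] sums = new long[chunks];
        for (int i = 0; i < chunks; i++) {
            sums[i] = crcOfChunk(data, i);
        }
        return sums;
    }

    /** Returns bytes [pos, pos + len) after verifying only the chunks that overlap the range. */
    static byte[] verifiedRead(byte[] data, long[] sums, int pos, int len) {
        if (pos < 0 || len < 0 || pos + len > data.length) {
            throw new IllegalArgumentException("range out of bounds");
        }
        if (len == 0) {
            return new byte[0];
        }
        int first = pos / BYTES_PER_SUM;
        int last = (pos + len - 1) / BYTES_PER_SUM;
        for (int i = first; i <= last; i++) {
            if (crcOfChunk(data, i) != sums[i]) {
                throw new IllegalStateException("checksum mismatch in chunk " + i);
            }
        }
        return Arrays.copyOfRange(data, pos, pos + len);
    }

    // CRC32 of chunk i; the final chunk may be shorter than BYTES_PER_SUM.
    private static long crcOfChunk(byte[] data, int i) {
        int off = i * BYTES_PER_SUM;
        int n = Math.min(BYTES_PER_SUM, data.length - off);
        CRC32 crc = new CRC32();
        crc.update(data, off, n);
        return crc.getValue();
    }
}
```

The sums would be computed once by the writer and checked by the eventual reader, which keeps the check end-to-end no matter which filesystem or intermediate store sits in between.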

I'm guessing the big impediment is lack of support in Java, but it seems like this would be a good application for the extended attributes/alternate forks/streams that so many file systems support these days.

JSR-203 ("NIO.2") adds multiple fork support and was approved more than three years ago:

http://jcp.org/en/jsr/detail?id=203

At the time it was slated for JDK 1.5, but then got deferred until Java 7. The story is tortured:

http://forums.java.net/jive/thread.jspa?threadID=298&messageID=12696
http://en.wikipedia.org/wiki/New_I/O

With the open-sourcing of Java, though, it seems like the code for NIO.2 should be available now or soon.
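
As a rough sketch of the extended-attribute idea, the following stores a file's CRC32 in a user-defined attribute and checks it later. It assumes the java.nio.file API in the shape NIO.2 shipped with in Java 7 (UserDefinedFileAttributeView). The class name and the attribute name "checksum.crc32" are hypothetical, and whether it works at all depends on the underlying file system supporting user attributes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.UserDefinedFileAttributeView;
import java.util.zip.CRC32;

final class XattrChecksum {
    // Hypothetical attribute name, used only in this sketch.
    static final String ATTR = "checksum.crc32";

    static long crc32(Path file) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(Files.readAllBytes(file)); // fine for a sketch; stream large files instead
        return crc.getValue();
    }

    /** Stores the CRC32 as a user-defined attribute; returns false if the file store lacks support. */
    static boolean store(Path file) throws IOException {
        if (!Files.getFileStore(file)
                .supportsFileAttributeView(UserDefinedFileAttributeView.class)) {
            return false;
        }
        UserDefinedFileAttributeView view =
                Files.getFileAttributeView(file, UserDefinedFileAttributeView.class);
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
        buf.putLong(crc32(file));
        buf.flip();
        view.write(ATTR, buf);
        return true;
    }

    /** Recomputes the CRC32 and compares it with the stored attribute (assumes store() succeeded). */
    static boolean verify(Path file) throws IOException {
        UserDefinedFileAttributeView view =
                Files.getFileAttributeView(file, UserDefinedFileAttributeView.class);
        ByteBuffer buf = ByteBuffer.allocate(view.size(ATTR));
        view.read(ATTR, buf);
        buf.flip();
        return buf.getLong() == crc32(file);
    }
}
```

Note that a single whole-file checksum like this has the same random-access drawback as the per-block checksum discussed above, and the attribute is easily lost when the file is copied by tools that ignore extended attributes.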

Ahh, here we go, Eclipse File System.

http://eclipsezone.com/eclipse/forums/t83786.html

This makes it sound like it might actually work:

http://wiki.eclipse.org/index.php/EFS#Local_file_system

Egad, there are more:

Extended Filesystem API (WebNFS)
http://docs.sun.com/app/docs/doc/806-1067/6jacl3e6g?a=view

NetBeans Filesystem API
http://www.netbeans.org/download/dev/javadoc/org-openide-filesystems/org/openide/filesystems/doc-files/api.html

Apache Commons VFS
http://jakarta.apache.org/commons/vfs/index.html

New I/O: Improved filesystem interface
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4313887

JSR 203: More New I/O APIs for the Java Platform ("NIO.2")
http://jcp.org/en/jsr/detail?id=203

IBM's AIO4J looks like a partial implementation, but it is focused on the asynchronous portion of the new API.

http://alphaworks.ibm.com/tech/aio4j

Somewhere in all that, it seems like there should be a nifty way to handle this. But I can see that sorting it out is a big job. What a mess.

*sigh*

Jim
