Hi (I only answer questions publicly, so this goes back to the Jackrabbit list),
On Mon, Feb 16, 2009 at 3:39 PM, imadhusudhanan <[email protected]> wrote:
> Currently we use Hadoop API to access files from Distributed File
> System. I would like to enable Webdav to the same DFS that I use using JR.
> May I know how I can make it possible ??

Jackrabbit is not the generic WebDAV-to-file-system mapper you might expect. Since it is a JCR repository, it must support all the fine-grained JCR features (nodes and residual properties, versioning, node types, locking, etc.) that cannot be mapped onto plain OS file systems or simple file system abstractions such as the one Hadoop provides. In theory such a mapping might be possible, but it is not an option for a performant implementation. Jackrabbit therefore has its own persistence abstraction (mainly around the PersistenceManager interface [1]), which is driven by the internal architecture needed to support the full JCR API.

[1] http://jackrabbit.apache.org/api/1.5/org/apache/jackrabbit/core/persistence/PersistenceManager.html

> Also Hadoop DFS has its own FileSystem. I guess that an entry in
> repository.xml <FileSystem> tag will change the file system to what ever I
> specify say the org.apache.hadoop.fs.LocalFileSystem etc.

No, you cannot use it that way. "FileSystem" is just a common name for persistence abstractions; in this case Hadoop's FileSystem (the base class org.apache.hadoop.fs.FileSystem) and Jackrabbit's FileSystem (the interface org.apache.jackrabbit.core.fs.FileSystem) are two completely different things. Jackrabbit's FileSystem is also somewhat deprecated and today not used for actual persistence - that is handled by PersistenceManagers, which operate at a low level where they no longer "know" about the hierarchy but work solely with UUIDs and node bundles. This means that writing a PersistenceManager that works on top of a Hadoop FileSystem is probably very difficult, maybe even impossible. I am not sure how Marcel's implementation works, but it seems to use a different Hadoop API (not the FileSystem).
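To make the mismatch concrete, here is a rough sketch of what Hadoop's FileSystem API offers: paths and whole-file byte streams, nothing more. The path name is just an example, and this assumes the Hadoop core jar on the classpath; with a default Configuration it resolves to the local file system.

```java
// Sketch of the Hadoop FileSystem abstraction: paths + byte streams only.
// The path /tmp/example.txt is a made-up example.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HadoopFsSketch {

    public static String roundTrip() throws Exception {
        Configuration conf = new Configuration(); // reads core-site.xml etc.
        FileSystem fs = FileSystem.get(conf);     // DFS or local, per config

        Path file = new Path("/tmp/example.txt");

        // All you can do is create, read and delete files as byte streams:
        FSDataOutputStream out = fs.create(file);
        out.write("hello".getBytes("UTF-8"));
        out.close();

        FSDataInputStream in = fs.open(file);
        byte[] buf = new byte[5];
        in.readFully(buf);
        in.close();

        fs.delete(file, false);

        // Note there was no place to hang JCR node types, residual
        // properties, version histories or locks on this API.
        return new String(buf, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip());
    }
}
```

Everything a JCR repository needs beyond raw content bytes (node structure, properties, versions, locks) would have to be serialized into such streams by hand, which is exactly the work a PersistenceManager does against its own storage.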
There are two options that might work for you, but both involve some coding effort. One is to use Jackrabbit's WebDAV server "library" to build your own server-side WebDAV implementation that connects to a Hadoop FileSystem. The other would be to implement the Jackrabbit SPI [2] - an API that is simpler to implement than the full JCR API - on top of a Hadoop FileSystem and let Jackrabbit provide full JCR above it, but this is a rather huge effort.

[2] http://jackrabbit.apache.org/jackrabbit-spi.html

Have a look at the following links if you are interested in more information about Jackrabbit's architecture:

http://jackrabbit.apache.org/jackrabbit-architecture.html
http://jackrabbit.apache.org/how-jackrabbit-works.html
http://jackrabbit.apache.org/jackrabbit-configuration.html

Regards,
Alex

--
Alexander Klimetschek
[email protected]
