That's true, but Hadoop's FileSystem API supports multiple implementations
chosen by the scheme of the URI being used; e.g., hdfs:// maps to
DistributedFileSystem. You can configure Hadoop to use the
RawLocalFileSystem class for file:// URIs, which is what a majority of the
integration tests do.
Be sure that you configure RawLocalFileSystem, as the ChecksumFileSystem
(the default for file://) will fail miserably around WAL recovery.
https://github.com/apache/accumulo/blob/master/test/src/main/java/org/apache/accumulo/test/BulkImportVolumeIT.java#L61
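For reference, a minimal sketch of that setting in Java, in the spirit of
what the linked test does (the class name and the file:///tmp/accumulo path
are only for this example):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.RawLocalFileSystem;

    public class RawLocalFsExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Map file:// URIs to RawLocalFileSystem instead of the default
        // checksummed LocalFileSystem, which breaks WAL recovery.
        conf.set("fs.file.impl", RawLocalFileSystem.class.getName());
        FileSystem fs = FileSystem.get(new Path("file:///tmp/accumulo").toUri(), conf);
        System.out.println(fs.getClass().getName()); // ...RawLocalFileSystem
      }
    }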
Dave Marion wrote:
IIRC, Accumulo *only* uses the HDFS client, so it needs something on the other
side that can respond to that protocol. MiniAccumulo starts up MiniHDFS for
this. You could run some other service locally that is HDFS-client compatible,
such as Quantcast QFS [1] (see the client setup guide [2]); a sketch of the
client wiring follows the references. If Accumulo uses anything in Hadoop
outside of the public client API, this may not work.
[1] https://github.com/quantcast/qfs
[2] https://github.com/quantcast/qfs/wiki/Migration-Guide
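As a rough illustration only: any HDFS-compatible FileSystem gets wired into
the Hadoop client via a fs.<scheme>.impl property. The QFS property names and
the QuantcastFileSystem class below are assumptions drawn from the migration
guide [2], so verify them there; the QFS Hadoop jar also has to be on the
classpath.

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class QfsClientSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed property names; confirm against the QFS migration guide [2].
        conf.set("fs.qfs.impl", "com.quantcast.qfs.hadoop.QuantcastFileSystem");
        conf.set("fs.qfs.metaServerHost", "localhost"); // example host
        conf.set("fs.qfs.metaServerPort", "20000");     // example port
        FileSystem fs = FileSystem.get(new URI("qfs://localhost:20000"), conf);
        System.out.println(fs.getUri());
      }
    }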
-----Original Message-----
From: Dylan Hutchison [mailto:[email protected]]
Sent: Monday, January 16, 2017 3:17 PM
To: [email protected]
Subject: Running Accumulo on a standard file system, without Hadoop
Hi folks,
A friend of mine asked about running Accumulo on a normal file system in
place of Hadoop, similar to the way MiniAccumulo runs. How feasible is this,
and how much work would it take to do so?
I think my friend is just interested in running on a single node, but I am
curious about both the single-node and distributed (via a parallel file
system like Lustre) cases.
Thanks, Dylan