That's true, but Hadoop's FileSystem API supports multiple implementations
chosen by the scheme of the URI being used; e.g., hdfs:// maps to
DistributedFileSystem. You can configure Hadoop to use the
RawLocalFileSystem class for file:// URIs, which is what a majority of the
integration tests do.
Be sure that you configure RawLocalFileSystem, as the ChecksumFileSystem
(the default for file://) will fail miserably around WAL recovery.
https://github.com/apache/accumulo/blob/master/test/src/main/java/org/apache/accumulo/test/BulkImportVolumeIT.java#L61
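For reference, a minimal sketch of that setting in Java, in the spirit of
what the linked test does (the class name and the file:///tmp/accumulo path
are only for this example):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.RawLocalFileSystem;

    public class RawLocalFsExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Map file:// URIs to RawLocalFileSystem instead of the default
        // checksummed LocalFileSystem, which breaks WAL recovery.
        conf.set("fs.file.impl", RawLocalFileSystem.class.getName());
        FileSystem fs = FileSystem.get(new Path("file:///tmp/accumulo").toUri(), conf);
        System.out.println(fs.getClass().getName()); // ...RawLocalFileSystem
      }
    }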
Dave Marion wrote:
IIRC, Accumulo *only* uses the HDFS client, so it needs something on the other
side that can respond to that protocol. MiniAccumulo starts up MiniHDFS for
this. You could run some other service locally that is HDFS-client compatible,
such as Quantcast QFS [1] (see the client setup guide [2]); a sketch of the
client wiring follows the references. If Accumulo uses anything in Hadoop
outside of the public client API, this may not work.
[1] https://github.com/quantcast/qfs
[2] https://github.com/quantcast/qfs/wiki/Migration-Guide
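As a rough illustration only: any HDFS-compatible FileSystem gets wired into
the Hadoop client via a fs.<scheme>.impl property. The QFS property names and
the QuantcastFileSystem class below are assumptions drawn from the migration
guide [2], so verify them there; the QFS Hadoop jar also has to be on the
classpath.

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class QfsClientSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed property names; confirm against the QFS migration guide [2].
        conf.set("fs.qfs.impl", "com.quantcast.qfs.hadoop.QuantcastFileSystem");
        conf.set("fs.qfs.metaServerHost", "localhost"); // example host
        conf.set("fs.qfs.metaServerPort", "20000");     // example port
        FileSystem fs = FileSystem.get(new URI("qfs://localhost:20000"), conf);
        System.out.println(fs.getUri());
      }
    }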
-----Original Message-----
From: Dylan Hutchison [mailto:[email protected]]
Sent: Monday, January 16, 2017 3:17 PM
To: [email protected]
Subject: Running Accumulo on a standard file system, without Hadoop
Hi folks,
A friend of mine asked about running Accumulo on a normal file system in
place of Hadoop, similar to the way MiniAccumulo runs. How feasible is this,
and how much work would it take to do so?
I think my friend is just interested in running on a single node, but I am
curious about both the single-node and distributed (via a parallel file
system like Lustre) cases.
Thanks, Dylan