So I am trying to create my own RFile and write it to Accumulo... in a
nutshell, here is what I do.
First I create my RFile and two directories: one that will contain the file
and one for failures, both required by importDirectory:
import java.util.Date;

import org.apache.accumulo.core.conf.AccumuloConfiguration;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.file.FileSKVWriter;
import org.apache.accumulo.core.file.rfile.RFile;
import org.apache.accumulo.core.file.rfile.RFileOperations;
import org.apache.accumulo.core.security.ColumnVisibility;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;

// Point the client at HDFS
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://blah:9000/");
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
FileSystem fs = FileSystem.get(conf);

// Staging directory for the import and a directory for failures
Path input = new Path("/accumulo/temp1/testing/");
Path output = new Path("/accumulo/temp1/testing/my_output");
fs.mkdirs(input);
fs.mkdirs(output);

// RFiles must carry the "rf" extension (RFile.EXTENSION)
String extension = RFile.EXTENSION;
String filename = "/accumulo/temp1/testing/my_input/testFile." + extension;
Path file = new Path(filename);
if (fs.exists(file)) {
    file.getFileSystem(conf).delete(file, false);
}

// Write a single key/value pair into the RFile
FileSKVWriter out = RFileOperations.getInstance().openWriter(filename, fs, conf,
        AccumuloConfiguration.getDefaultConfiguration());
out.startDefaultLocalityGroup();
long timestamp = (new Date()).getTime();
Key key = new Key(new Text("row_1"), new Text("cf"), new Text("cq"),
        new ColumnVisibility(), timestamp);
Value value = new Value("".getBytes()); // empty value is fine for a test
out.append(key, value);
out.close();
At this point I can SSH into my namenode and see the file and the two
directories.
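For what it's worth, the client code agrees; I added this quick listing to see
exactly what the client-side FileSystem reports under the staging directory
(FileStatus is org.apache.hadoop.fs.FileStatus):

// Sanity check added for this post: list the staging directory contents
for (FileStatus stat : fs.listStatus(new Path("/accumulo/temp1/testing"))) {
    System.out.println(stat.getPath() + "  owner=" + stat.getOwner());
}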
Then I try to bulk import it:
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Instance;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.admin.TableOperations;

String instanceName = "blah";
String zooServers = "blah:2181,blah:2181";
String userName = ""; // provide username
String password = ""; // provide password

// Connect
Instance inst = new ZooKeeperInstance(instanceName, zooServers);
Connector conn = inst.getConnector(userName, password);

// Recreate the table, then import the staged directory
TableOperations ops = conn.tableOperations();
if (ops.exists("mynewtesttable")) {
    ops.delete("mynewtesttable");
}
ops.create("mynewtesttable");
ops.importDirectory("mynewtesttable", input.toString(), output.toString(), false);
The exception that I am getting is:
SEVERE: null
org.apache.accumulo.core.client.AccumuloException: Bulk import directory
/accumulo/temp1/testing does not exist!
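One thing I am not sure about: my failures directory is nested inside the
import directory, and the RFile itself sits in a my_input subdirectory rather
than directly under the path I pass to importDirectory. Would a layout like
this be closer to what importDirectory expects? (The directory names here are
just my guess, not from any example.)

// Guessed layout: the RFile directly inside the import directory, and an
// empty failures directory that is NOT nested inside it
Path importDir = new Path("/accumulo/temp1/testing/my_input"); // holds testFile.rf
Path failureDir = new Path("/accumulo/temp1/failures");        // empty, separate
fs.mkdirs(failureDir);
ops.importDirectory("mynewtesttable", importDir.toString(), failureDir.toString(), false);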
I tried playing around with the file/directory owner by manually setting it
to accumulo and then to hadoop, but no luck.
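Concretely, the ownership changes looked roughly like this (done with the same
fs handle as above; "supergroup" is just the default HDFS group on my cluster):

// Requires HDFS superuser privileges (or dfs.permissions disabled)
fs.setOwner(new Path("/accumulo/temp1/testing"), "accumulo", "supergroup");
fs.setOwner(new Path("/accumulo/temp1/testing/my_output"), "accumulo", "supergroup");
fs.setOwner(new Path("/accumulo/temp1/testing/my_input/testFile.rf"), "accumulo", "supergroup");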
I checked hdfs-site.xml and I have:
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
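For completeness, this is an easy way to print the owner, group, and
permission bits the client actually gets back for the import directory:

// FileStatus is org.apache.hadoop.fs.FileStatus
FileStatus dirStatus = fs.getFileStatus(new Path("/accumulo/temp1/testing"));
System.out.println(dirStatus.getOwner() + ":" + dirStatus.getGroup()
        + " " + dirStatus.getPermission());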
Any ideas about what might be wrong?