Geoffry,

Take a look at the RdfFileInputTool [1] in the rya.mapreduce module. It doesn't look like the shaded jar was uploaded to Maven, so you will likely need to build that artifact yourself by including the "-P mr" profile when building Rya.
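For reference, the build step might look like the sketch below. It assumes you run it from the root of a Rya source checkout and that the mapreduce module writes its output to mapreduce/target/; the -DskipTests flag is only there to shorten the build and is optional. I have not verified this against the current master.

```shell
# Assumption: current directory is the root of a Rya source checkout.
# Build Rya with the "mr" profile enabled so the shaded MapReduce jar is produced.
mvn clean install -P mr -DskipTests

# Assumption: the shaded jar lands in the mapreduce module's target directory.
ls mapreduce/target/rya.mapreduce-*-shaded.jar
```

If the ls finds nothing, check the output of the mapreduce module in the Maven build log to see where the shaded artifact was written.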
There are instructions for loading data with the RdfFileInputTool here [2], but they appear to be out of date. I haven't tried it recently, but this command, based on the unit test [3], should work:

    hadoop jar target/rya.mapreduce-3.2.12-shaded.jar org.apache.rya.accumulo.mr.tools.RdfFileInputTool \
        -Dac.zk=zoo1,zoo2,zoo3 \
        -Dac.instance=accumulo \
        -Dac.username=root \
        -Dac.pwd=password \
        -Dac.auth=auths \
        -Dac.cv=auths \
        -Drdf.tablePrefix=rya_ \
        -Drdf.format=N-Triples \
        /hdfs/path/to/triplefiles

[1] https://github.com/apache/incubator-rya/blob/master/mapreduce/src/main/java/org/apache/rya/accumulo/mr/tools/RdfFileInputTool.java
[2] https://github.com/apache/incubator-rya/blob/master/extras/rya.manual/src/site/markdown/loaddata.md
[3] https://github.com/apache/incubator-rya/blob/master/mapreduce/src/test/java/org/apache/rya/accumulo/mr/tools/RdfFileInputToolTest.java

On Wed, Mar 14, 2018 at 5:28 PM, Geoffry Roberts <threadedb...@gmail.com> wrote:

> All,
>
> Am I doing things the best way?
>
> I have a pile of data that I need to load into Rya. I must first convert it into RDF, then do the load. I am using map/reduce because I have a lot of data.
>
> I have an hdfs directory full of RDF in NTRIPLE format.
>
> I have a mapper like this:
>
> protected void map(LongWritable key, RyaStatementWritable value, Context ctx) {
>     // RyaStatementWritable gives me a RyaStatement like this:
>     RyaStatement ryaStatement = value.getRyaStatement();
>
>     // At this point I find myself having to convert the
>     // RyaStatement into an OpenRDF Statement like this:
>     Sail ryaSail = RyaSailFactory.getInstance(conf);
>     ValueFactory vf = ryaSail.getValueFactory();
>     Statement stmt = vf.createStatement(vf.createURI(sS), vf.createURI(sP), vf.createURI(sO));
>     ctx.write(null, stmt);
> }
>
> In my reducer, I use AccumuloLoadStatements to load Rya like this:
>
> protected void reduce(NullWritable key, Iterable<Statement> stmts,
>         Reducer<NullWritable, Statement, NullWritable, NullWritable>.Context ctx)
>         throws IOException, InterruptedException {
>     super.reduce(key, stmts, ctx);
>
>     AccumuloLoadStatements load = ...omitted for brevity...
>
>     try {
>         load.loadStatements(instance, stmts);
>     } catch (RyaClientException e) {
>         log.error("", e);
>     }
> }
>
> Thanks
>
> --
> There are ways and there are ways,
>
> Geoffry Roberts