Geoffry,

Take a look at the RdfFileInputTool [1] in the rya.mapreduce module.  The
shaded jar does not appear to have been uploaded to Maven, so you will
likely need to build that artifact yourself by including the "-P mr" profile
when building Rya.

There are instructions for loading data with the RdfFileInputTool here [2],
but they appear to be out of date.  I haven't tried it recently, but this
command, based on the unit test [3], should work:

hadoop jar target/rya.mapreduce-3.2.12-shaded.jar \
    org.apache.rya.accumulo.mr.tools.RdfFileInputTool \
    -Dac.zk=zoo1,zoo2,zoo3 -Dac.instance=accumulo -Dac.username=root \
    -Dac.pwd=password -Dac.auth=auths -Dac.cv=auths -Drdf.tablePrefix=rya_ \
    -Drdf.format=N-Triples /hdfs/path/to/triplefiles
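
If you end up scripting the load, here is a minimal sketch (not run against a
cluster) of assembling that same command line in Java.  The class and method
names (RdfLoadCommand, build, exampleProps) are hypothetical and exist only
for illustration; the property keys and placeholder values are copied from
the command above and would need to match your own Accumulo instance.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Rya: it only assembles the argument list.
class RdfLoadCommand {

    // Returns: hadoop jar <jar> <mainClass> -Dkey=value ... <inputPath>
    static List<String> build(String jar, String mainClass,
                              Map<String, String> props, String inputPath) {
        List<String> cmd = new ArrayList<>();
        cmd.add("hadoop");
        cmd.add("jar");
        cmd.add(jar);
        cmd.add(mainClass);
        // The -D options go after the main class and before the input path,
        // in the same order as in the command above.
        for (Map.Entry<String, String> e : props.entrySet()) {
            cmd.add("-D" + e.getKey() + "=" + e.getValue());
        }
        cmd.add(inputPath);
        return cmd;
    }

    // Placeholder values taken from the command above; replace with your own.
    static Map<String, String> exampleProps() {
        Map<String, String> props = new LinkedHashMap<>(); // keeps insertion order
        props.put("ac.zk", "zoo1,zoo2,zoo3");
        props.put("ac.instance", "accumulo");
        props.put("ac.username", "root");
        props.put("ac.pwd", "password");
        props.put("ac.auth", "auths");
        props.put("ac.cv", "auths");
        props.put("rdf.tablePrefix", "rya_");
        props.put("rdf.format", "N-Triples");
        return props;
    }

    public static void main(String[] args) {
        List<String> cmd = build(
                "target/rya.mapreduce-3.2.12-shaded.jar",
                "org.apache.rya.accumulo.mr.tools.RdfFileInputTool",
                exampleProps(),
                "/hdfs/path/to/triplefiles");
        // Print the command for inspection rather than executing it.
        System.out.println(String.join(" ", cmd));
    }
}
```

The list can be handed to java.lang.ProcessBuilder if you want to launch the
job from a driver program instead of a shell.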


[1] https://github.com/apache/incubator-rya/blob/master/mapreduce/src/main/java/org/apache/rya/accumulo/mr/tools/RdfFileInputTool.java
[2] https://github.com/apache/incubator-rya/blob/master/extras/rya.manual/src/site/markdown/loaddata.md
[3] https://github.com/apache/incubator-rya/blob/master/mapreduce/src/test/java/org/apache/rya/accumulo/mr/tools/RdfFileInputToolTest.java



On Wed, Mar 14, 2018 at 5:28 PM, Geoffry Roberts <threadedb...@gmail.com>
wrote:

> All,
>
> Am I doing things the best way?
>
> I have a pile of data that I need to load into Rya.  I must first convert
> it into RDF, then do the load.  I am using map/reduce because I have a lot
> of data.
>
> I have an HDFS directory full of RDF in N-Triples format.
>
> I have a mapper like this:
>
> protected void map(LongWritable key, RyaStatementWritable value, Context ctx)
> {
>     // RyaStatementWritable gives me a RyaStatement like this:
>     RyaStatement ryaStatement = value.getRyaStatement();
>
>     // At this point I find myself having to convert the
>     // RyaStatement into an OpenRDF Statement like this:
>     Sail ryaSail = RyaSailFactory.getInstance(conf);
>     ValueFactory vf = ryaSail.getValueFactory();
>     Statement stmt = vf.createStatement(vf.createURI(sS), vf.createURI(sP),
>             vf.createURI(sO));
>     ctx.write(null, stmt);
> }
>
> In my reducer, I use AccumuloLoadStatements to load Rya like this:
>
> protected void reduce(NullWritable key, Iterable<Statement> stmts,
>         Reducer<NullWritable, Statement, NullWritable, NullWritable>.Context ctx)
>         throws IOException, InterruptedException {
>     super.reduce(key, stmts, ctx);
>
>     AccumuloLoadStatements load = ...omitted for brevity...
>
>     try {
>         load.loadStatements(instance, stmts);
>     } catch (RyaClientException e) {
>         log.error("", e);
>     }
> }
>
>
> Thanks
>
> --
> There are ways and there are ways,
>
> Geoffry Roberts
>
