Author: buildbot
Date: Tue Feb 17 14:45:54 2015
New Revision: 940471

Log:
Staging update by buildbot for jena

Modified:
    websites/staging/jena/trunk/content/   (props changed)
    websites/staging/jena/trunk/content/documentation/hadoop/mapred.html

Propchange: websites/staging/jena/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Tue Feb 17 14:45:54 2015
@@ -1 +1 @@
-1660381
+1660394

Modified: websites/staging/jena/trunk/content/documentation/hadoop/mapred.html
==============================================================================
--- websites/staging/jena/trunk/content/documentation/hadoop/mapred.html (original)
+++ websites/staging/jena/trunk/content/documentation/hadoop/mapred.html Tue Feb 17 14:45:54 2015
@@ -191,7 +191,7 @@
 <p>Finally, you may be interested in the usage of namespaces within your data, 
in which case the <code>TripleNamespaceCountMapper</code> or 
<code>QuadNamespaceCountMapper</code> can be used.  For this use 
case you should use the <code>TextCountReducer</code> to total up the counts 
for each namespace.  Note that the mappers determine the namespace for a URI 
simply by splitting after the last <code>#</code> or <code>/</code> in the URI; 
if no such character exists then the full URI is considered to be the 
namespace.</p>
 <h2 id="filtering">Filtering</h2>
 <p>Filtering is another classic Map/Reduce use case: here you want to take the 
data and extract only the portions that you are interested in based on some 
criteria.  All our filter <code>Mapper</code> implementations also support a 
Job configuration option named <code>rdf.mapreduce.filter.invert</code>, 
allowing their effects to be inverted if desired, e.g.</p>
-<div class="codehilite"><pre><span class="n">config</span><span 
class="p">.</span><span class="n">setProperty</span><span 
class="p">(</span><span class="n">RdfMapReduceConstants</span><span 
class="p">.</span><span class="n">FILTER_INVERT</span><span class="p">,</span> 
<span class="n">true</span><span class="p">);</span>
+<div class="codehilite"><pre><span class="n">config</span><span 
class="p">.</span><span class="n">setBoolean</span><span 
class="p">(</span><span class="n">RdfMapReduceConstants</span><span 
class="p">.</span><span class="n">FILTER_INVERT</span><span class="p">,</span> 
<span class="n">true</span><span class="p">);</span>
 </pre></div>
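
The namespace rule quoted in the hunk above (split after the last `#` or `/`, otherwise treat the full URI as the namespace) can be sketched in plain Java. This is only an illustration: `NamespaceSketch` and `namespaceOf` are hypothetical names, not part of the Jena API, and the sketch assumes "last" means whichever of the two characters occurs later in the URI.

```java
// Hypothetical helper, not part of Jena: illustrates the splitting rule only.
final class NamespaceSketch {
    private NamespaceSketch() {}

    static String namespaceOf(String uri) {
        // Assumption: take whichever of '#' or '/' occurs later in the URI.
        int idx = Math.max(uri.lastIndexOf('#'), uri.lastIndexOf('/'));
        // "Splitting after" the separator keeps it in the namespace;
        // with no separator at all, the whole URI is the namespace.
        return idx < 0 ? uri : uri.substring(0, idx + 1);
    }
}
```

Under that assumption, `http://example.org/predicate` would be counted under the namespace `http://example.org/`, and a URI such as `urn:isbn:123` would be its own namespace.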
 
 
@@ -208,12 +208,12 @@
 <p>In some cases you may only be interested in triples/quads that are 
grounded, i.e. don't contain blank nodes, in which case the 
<code>GroundTripleFilterMapper</code> and <code>GroundQuadFilterMapper</code> 
can be used.</p>
 <h3 id="data-with-a-specific-uri">Data with a specific URI</h3>
 <p>In many cases you may want to extract only data where a specific URI 
occurs in a specific position.  For example, if you wanted to extract all the 
<code>rdf:type</code> declarations then you might want to use the 
<code>TripleFilterByPredicateUriMapper</code> or 
<code>QuadFilterByPredicateUriMapper</code> as appropriate.  The job 
configuration option <code>rdf.mapreduce.filter.predicate.uris</code> is used 
to provide a comma separated list of the full URIs you want the filter to 
accept e.g.</p>
-<div class="codehilite"><pre><span class="n">config</span><span 
class="p">.</span><span class="n">setProperty</span><span 
class="p">(</span><span class="n">RdfMapReduceConstants</span><span 
class="p">.</span><span class="n">FILTER_PREDICATE_URIS</span><span 
class="p">,</span> &quot;<span class="n">http</span><span 
class="p">:</span><span class="o">//</span><span class="n">example</span><span 
class="p">.</span><span class="n">org</span><span class="o">/</span><span 
class="n">predicate</span><span class="p">,</span><span 
class="n">http</span><span class="p">:</span><span class="o">//</span><span 
class="n">another</span><span class="p">.</span><span class="n">org</span><span 
class="o">/</span><span class="n">predicate</span>&quot;<span 
class="p">);</span>
+<div class="codehilite"><pre><span class="n">config</span><span 
class="p">.</span><span class="n">set</span><span 
class="p">(</span><span class="n">RdfMapReduceConstants</span><span 
class="p">.</span><span class="n">FILTER_PREDICATE_URIS</span><span 
class="p">,</span> &quot;<span class="n">http</span><span 
class="p">:</span><span class="o">//</span><span class="n">example</span><span 
class="p">.</span><span class="n">org</span><span class="o">/</span><span 
class="n">predicate</span><span class="p">,</span><span 
class="n">http</span><span class="p">:</span><span class="o">//</span><span 
class="n">another</span><span class="p">.</span><span class="n">org</span><span 
class="o">/</span><span class="n">predicate</span>&quot;<span 
class="p">);</span>
 </pre></div>
 
 
 <p>As with the counting of node usage, you can replace 
<code>Predicate</code> with <code>Subject</code>, <code>Object</code> or 
<code>Graph</code> as desired.  You will also need to do this in the job 
configuration option. For example, to filter on subject URIs in quads use the 
<code>QuadFilterBySubjectUriMapper</code> and the 
<code>rdf.mapreduce.filter.subject.uris</code> configuration option, e.g.</p>
-<div class="codehilite"><pre><span class="n">config</span><span 
class="p">.</span><span class="n">setProperty</span><span 
class="p">(</span><span class="n">RdfMapReduceConstants</span><span 
class="p">.</span><span class="n">FILTER_SUBJECT_URIS</span><span 
class="p">,</span> &quot;<span class="n">http</span><span 
class="p">:</span><span class="o">//</span><span class="n">example</span><span 
class="p">.</span><span class="n">org</span><span class="o">/</span><span 
class="n">myInstance</span>&quot;<span class="p">);</span>
+<div class="codehilite"><pre><span class="n">config</span><span 
class="p">.</span><span class="n">set</span><span 
class="p">(</span><span class="n">RdfMapReduceConstants</span><span 
class="p">.</span><span class="n">FILTER_SUBJECT_URIS</span><span 
class="p">,</span> &quot;<span class="n">http</span><span 
class="p">:</span><span class="o">//</span><span class="n">example</span><span 
class="p">.</span><span class="n">org</span><span class="o">/</span><span 
class="n">myInstance</span>&quot;<span class="p">);</span>
 </pre></div>
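
To make the two filter options in this diff concrete, here is a minimal sketch of how a comma-separated URI list and the invert flag might combine. It assumes nothing about the real mapper internals: `UriFilterSketch` and `keeps` are hypothetical names, and trimming whitespace around each URI is an assumption of this sketch rather than documented behaviour.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration, not the Jena mapper implementation.
final class UriFilterSketch {
    private final Set<String> accepted;
    private final boolean invert;

    UriFilterSketch(String commaSeparatedUris, boolean invert) {
        // Split the option value on commas; trimming is an assumption here.
        this.accepted = Arrays.stream(commaSeparatedUris.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toSet());
        this.invert = invert;
    }

    // Keep the URI if it is in the accepted set, or the opposite when inverted.
    boolean keeps(String uri) {
        return accepted.contains(uri) != invert;
    }
}
```

With the list `http://example.org/predicate,http://another.org/predicate` and `invert` set to `false`, only those two predicates would pass; with `invert` set to `true`, everything except them would.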
 
 

