Author: nick
Date: Sat Dec 20 07:12:41 2014
New Revision: 1646921
URL: http://svn.apache.org/r1646921
Log:
Include more examples in the website TIKA-1390
Modified:
tika/site/publish/1.7/examples.html
tika/site/src/site/apt/1.7/examples.apt
Modified: tika/site/publish/1.7/examples.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/1.7/examples.html?rev=1646921&r1=1646920&r2=1646921&view=diff
==============================================================================
--- tika/site/publish/1.7/examples.html (original)
+++ tika/site/publish/1.7/examples.html Sat Dec 20 07:12:41 2014
@@ -87,13 +87,16 @@
<!-- Licensed to the Apache Software Foundation (ASF) under one or
more --><!-- contributor license agreements. See the NOTICE file distributed
with --><!-- this work for additional information regarding copyright
ownership. --><!-- The ASF licenses this file to You under the Apache License,
Version 2.0 --><!-- (the "License"); you may not use this file except in
compliance with --><!-- the License. You may obtain a copy of the License at
--><!-- --><!-- http://www.apache.org/licenses/LICENSE-2.0 --><!-- --><!--
Unless required by applicable law or agreed to in writing, software --><!--
distributed under the License is distributed on an "AS IS" BASIS, --><!--
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
--><!-- See the License for the specific language governing permissions and
--><!-- limitations under the License. --><div class="section">
<h2>Apache Tika API Usage Examples<a
name="Apache_Tika_API_Usage_Examples"></a></h2>
<p>This page provides a number of examples on how to use the various Tika
APIs. All of the examples shown are also available in the <a
class="externalLink"
href="https://svn.apache.org/repos/asf/tika/trunk/tika-example">Tika Example
module</a> in SVN.</p>
-<p>TODO Complete</p>
<ul>
<li><a href="#Apache_Tika_API_Usage_Examples">Apache Tika API Usage
Examples</a>
<ul>
<li><a href="#Parsing">Parsing</a>
<ul>
-<li><a href="#Parsing_using_the_Tika_Facade">Parsing using the Tika
Facade</a></li></ul></li>
+<li><a href="#Parsing_using_the_Tika_Facade">Parsing using the Tika
Facade</a></li>
+<li><a href="#Parsing_using_the_Auto-Detect_Parser">Parsing using the
Auto-Detect Parser</a></li></ul></li>
+<li><a href="#Picking_different_output_types">Picking different output
types</a>
+<ul>
+<li><a href="#Parsing_to_Plain_Text">Parsing to Plain Text</a></li></ul></li>
<li><a href="#Custom_Content_Handlers">Custom Content Handlers</a>
<ul>
<li><a href="#Extract_Phone_Numbers_from_Content_into_the_Metadata">Extract
Phone Numbers from Content into the Metadata</a></li></ul></li>
@@ -102,25 +105,34 @@
<li><a href="#Translation_using_the_Microsoft_Translation_API">Translation
using the Microsoft Translation API</a></li></ul></li></ul></li></ul>
<div class="section">
<h3><a name="Parsing">Parsing</a></h3>
-<p>TODO Explain the options</p>
+<p>Tika provides a number of different ways to trigger the parsing of a file.
These provide different levels of control and flexibility, with varying levels
of complexity to trigger.</p>
<div class="section">
<h4><a name="Parsing_using_the_Tika_Facade">Parsing using the Tika
Facade</a></h4>
-<p>TODO Explain about using this</p><style type="text/css">
+<p>The <a href="./apidocs/org/apache/tika/Tika.html">Tika facade</a> provides
a number of quick and easy ways to have your content parsed by Tika and
the resulting plain text returned.</p><style type="text/css">
@import url('attached-includes/css/shCoreDefault.css');
</style>
-<div id="highlighter_882359" class="syntaxhighlighter nogutter java"><table
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div
class="container"><div class="line number37 index0 alt2"><code class="java
keyword">public</code> <code class="java plain">String parseToStringExample()
</code><code class="java keyword">throws</code> <code class="java
plain">IOException, SAXException, TikaException {</code></div><div class="line
number38 index1 alt1"><code class="java
spaces"> </code><code class="java plain">InputStream
stream = ParsingExample.</code><code class="java keyword">class</code><code
class="java plain">.getResourceAsStream(</code><code class="java
string">"test.doc"</code><code class="java plain">);</code></div><div
class="line number39 index2 alt2"><code class="java
spaces"> </code><code class="java plain">Tika tika =
</code><code class="java keyword">new</code> <code class="java
plain">Tika();</code></div><div
class="line number40 index3 alt1"><code class="java
spaces"> </code><code class="java keyword">try</code>
<code class="java plain">{</code></div><div class="line number41 index4
alt2"><code class="java
spaces"> </code><code
class="java keyword">return</code> <code class="java
plain">tika.parseToString(stream);</code></div><div class="line number42 index5
alt1"><code class="java spaces"> </code><code
class="java plain">} </code><code class="java keyword">finally</code> <code
class="java plain">{</code></div><div class="line number43 index6 alt2"><code
class="java
spaces"> </code><code
class="java plain">stream.close();</code></div><div class="line number44 index7
alt1"><code class="java spaces"> </code><code
class="java plain">}</code></div><div class="line number45 index8 alt2"><code
class="java plain">}</code></div></div></td></tr></tbody></table></div></div></div>
+<div id="highlighter_418999" class="syntaxhighlighter nogutter java"><table
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div
class="container"><div class="line number37 index0 alt2"><code class="java
keyword">public</code> <code class="java plain">String parseToStringExample()
</code><code class="java keyword">throws</code> <code class="java
plain">IOException, SAXException, TikaException {</code></div><div class="line
number38 index1 alt1"><code class="java
spaces"> </code><code class="java plain">InputStream
stream = ParsingExample.</code><code class="java keyword">class</code><code
class="java plain">.getResourceAsStream(</code><code class="java
string">"test.doc"</code><code class="java plain">);</code></div><div
class="line number39 index2 alt2"><code class="java
spaces"> </code><code class="java plain">Tika tika =
</code><code class="java keyword">new</code> <code class="java
plain">Tika();</code></div><div
class="line number40 index3 alt1"><code class="java
spaces"> </code><code class="java keyword">try</code>
<code class="java plain">{</code></div><div class="line number41 index4
alt2"><code class="java
spaces"> </code><code
class="java keyword">return</code> <code class="java
plain">tika.parseToString(stream);</code></div><div class="line number42 index5
alt1"><code class="java spaces"> </code><code
class="java plain">} </code><code class="java keyword">finally</code> <code
class="java plain">{</code></div><div class="line number43 index6 alt2"><code
class="java
spaces"> </code><code
class="java plain">stream.close();</code></div><div class="line number44 index7
alt1"><code class="java spaces"> </code><code
class="java plain">}</code></div><div class="line number45 index8 alt2"><code
class="java plain">}</code></div></div></td></tr></tbody></table></div></div>
+<div class="section">
+<h4><a name="Parsing_using_the_Auto-Detect_Parser">Parsing using the
Auto-Detect Parser</a></h4>
+<p>For more control, you can call the <a
href="./apidocs/org/apache/tika/parser/Parser.html">Tika Parsers</a> directly.
Most likely, you'll want to start out using the <a
href="./apidocs/org/apache/tika/parser/AutoDetectParser.html">Auto-Detect
Parser</a>, which works out what kind of content you have, then finds an
appropriate parser to call for you.</p><div id="highlighter_633677"
class="syntaxhighlighter nogutter java"><table border="0" cellpadding="0"
cellspacing="0"><tbody><tr><td class="code"><div class="container"><div
class="line number66 index0 alt1"><code class="java keyword">public</code>
<code class="java plain">String parseExample() </code><code class="java
keyword">throws</code> <code class="java plain">IOException, SAXException,
TikaException {</code></div><div class="line number67 index1 alt2"><code
class="java spaces"> </code><code class="java
plain">InputStream stream = ParsingExample.</code><code class="java
keyword">class</code><code
class="java plain">.getResourceAsStream(</code><code class="java
string">"test.doc"</code><code class="java plain">);</code></div><div
class="line number68 index2 alt1"><code class="java
spaces"> </code><code class="java
plain">AutoDetectParser parser = </code><code class="java keyword">new</code>
<code class="java plain">AutoDetectParser();</code></div><div class="line
number69 index3 alt2"><code class="java
spaces"> </code><code class="java
plain">BodyContentHandler handler = </code><code class="java
keyword">new</code> <code class="java
plain">BodyContentHandler();</code></div><div class="line number70 index4
alt1"><code class="java spaces"> </code><code
class="java plain">Metadata metadata = </code><code class="java
keyword">new</code> <code class="java plain">Metadata();</code></div><div
class="line number71 index5 alt2"><code class="java
spaces"> </code><code class="java keyword">try</code>
<code class="java plain">{</code></div><div class="line number72 index6
alt1"><code class="java
spaces"> </code><code
class="java plain">parser.parse(stream, handler, metadata);</code></div><div
class="line number73 index7 alt2"><code class="java
spaces"> </code><code
class="java keyword">return</code> <code class="java
plain">handler.toString();</code></div><div class="line number74 index8
alt1"><code class="java spaces"> </code><code
class="java plain">} </code><code class="java keyword">finally</code> <code
class="java plain">{</code></div><div class="line number75 index9 alt2"><code
class="java
spaces"> </code><code
class="java plain">stream.close();</code></div><div class="line number76
index10 alt1"><code class="java spaces"> </code><code
class="java plain">}</code></div><div class="line number77
index11 alt2"><code class="java
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
+<div class="section">
+<h3><a name="Picking_different_output_types">Picking different output
types</a></h3>
+<p>With Tika, you can get the textual content of your files returned in a
number of different formats. These can be plain text, html, xhtml, xhtml of one
part of the file etc. This is controlled based on the <a class="externalLink"
href="http://docs.oracle.com/javase/7/docs/api/org/xml/sax/ContentHandler.html">ContentHandler</a>
you supply to the Parser.</p>
+<div class="section">
+<h4><a name="Parsing_to_Plain_Text">Parsing to Plain Text</a></h4>
+<p>TODO</p></div></div>
<div class="section">
<h3><a name="Custom_Content_Handlers">Custom Content Handlers</a></h3>
<p>The textual output of parsing a file with Tika is returned via the SAX <a
class="externalLink"
href="http://docs.oracle.com/javase/7/docs/api/org/xml/sax/ContentHandler.html">ContentHandler</a>
you pass to the parse method. It is possible to customise your parsing by
supplying your own ContentHandler which does special things.</p>
<div class="section">
<h4><a name="Extract_Phone_Numbers_from_Content_into_the_Metadata">Extract
Phone Numbers from Content into the Metadata</a></h4>
-<p>By using the <a
href="./apidocs/org/apache/tika/sax/PhoneExtractingContentHandler.html">PhoneExtractingContentHandler</a>,
you can have any phone numbers found in the textual content of the document
extracted and placed into the Metadata object for you.</p><div
id="highlighter_956409" class="syntaxhighlighter nogutter java"><table
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div
class="container"><div class="line number69 index0 alt2"><code class="java
keyword">public</code> <code class="java keyword">static</code> <code
class="java keyword">void</code> <code class="java plain">process(File file)
</code><code class="java keyword">throws</code> <code class="java
plain">Exception {</code></div><div class="line number70 index1 alt1"><code
class="java spaces"> </code><code class="java
plain">Parser parser = </code><code class="java keyword">new</code> <code
class="java plain">AutoDetectParser();</code></div><div class="line number71
index2 alt2"><code class="java spaces"> </code><code
class="java plain">Metadata metadata = </code><code class="java
keyword">new</code> <code class="java plain">Metadata();</code></div><div
class="line number72 index3 alt1"><code class="java
spaces"> </code><code class="java comments">// The
PhoneExtractingContentHandler will examine any characters for phone numbers
before passing them</code></div><div class="line number73 index4 alt2"><code
class="java spaces"> </code><code class="java
comments">// to the underlying Handler.</code></div><div class="line number74
index5 alt1"><code class="java spaces"> </code><code
class="java plain">PhoneExtractingContentHandler handler = </code><code
class="java keyword">new</code> <code class="java
plain">PhoneExtractingContentHandler(</code><code class="java
keyword">new</code> <code class="java plain">BodyContentHandler(),
metadata);</code></div><div
class="line number75 index6 alt2"><code class="java
spaces"> </code><code class="java plain">InputStream
stream = </code><code class="java keyword">new</code> <code class="java
plain">FileInputStream(file);</code></div><div class="line number76 index7
alt1"><code class="java spaces"> </code><code
class="java keyword">try</code> <code class="java plain">{</code></div><div
class="line number77 index8 alt2"><code class="java
spaces"> </code><code
class="java plain">parser.parse(stream, handler, metadata, </code><code
class="java keyword">new</code> <code class="java
plain">ParseContext());</code></div><div class="line number78 index9
alt1"><code class="java spaces"> </code><code
class="java plain">}</code></div><div class="line number79 index10 alt2"><code
class="java spaces"> </code><code class="java
keyword">finally</code> <code class="java plain">{
</code></div><div class="line number80 index11 alt1"><code class="java
spaces"> </code><code
class="java plain">stream.close();</code></div><div class="line number81
index12 alt2"><code class="java spaces"> </code><code
class="java plain">}</code></div><div class="line number82 index13 alt1"><code
class="java spaces"> </code><code class="java
plain">String[] numbers = metadata.getValues(</code><code class="java
string">"phonenumbers"</code><code class="java plain">);</code></div><div
class="line number83 index14 alt2"><code class="java
spaces"> </code><code class="java keyword">for</code>
<code class="java plain">(String number : numbers) {</code></div><div
class="line number84 index15 alt1"><code class="java
spaces"> </code><code
class="java plain">phoneNumbers.add(number);</code></div><div class="line
number85 index16 alt2"><code class="java spaces"> </code><code class="java
plain">}</code></div><div class="line number86 index17 alt1"><code class="java
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
+<p>By using the <a
href="./apidocs/org/apache/tika/sax/PhoneExtractingContentHandler.html">PhoneExtractingContentHandler</a>,
you can have any phone numbers found in the textual content of the document
extracted and placed into the Metadata object for you.</p><div
id="highlighter_263937" class="syntaxhighlighter nogutter java"><table
border="0" cellpadding="0" cellspacing="0"><tbody><tr><td class="code"><div
class="container"><div class="line number69 index0 alt2"><code class="java
keyword">public</code> <code class="java keyword">static</code> <code
class="java keyword">void</code> <code class="java plain">process(File file)
</code><code class="java keyword">throws</code> <code class="java
plain">Exception {</code></div><div class="line number70 index1 alt1"><code
class="java spaces"> </code><code class="java
plain">Parser parser = </code><code class="java keyword">new</code> <code
class="java plain">AutoDetectParser();</code></div><div class="line number71
index2 alt2"><code class="java spaces"> </code><code
class="java plain">Metadata metadata = </code><code class="java
keyword">new</code> <code class="java plain">Metadata();</code></div><div
class="line number72 index3 alt1"><code class="java
spaces"> </code><code class="java comments">// The
PhoneExtractingContentHandler will examine any characters for phone numbers
before passing them</code></div><div class="line number73 index4 alt2"><code
class="java spaces"> </code><code class="java
comments">// to the underlying Handler.</code></div><div class="line number74
index5 alt1"><code class="java spaces"> </code><code
class="java plain">PhoneExtractingContentHandler handler = </code><code
class="java keyword">new</code> <code class="java
plain">PhoneExtractingContentHandler(</code><code class="java
keyword">new</code> <code class="java plain">BodyContentHandler(),
metadata);</code></div><div
class="line number75 index6 alt2"><code class="java
spaces"> </code><code class="java plain">InputStream
stream = </code><code class="java keyword">new</code> <code class="java
plain">FileInputStream(file);</code></div><div class="line number76 index7
alt1"><code class="java spaces"> </code><code
class="java keyword">try</code> <code class="java plain">{</code></div><div
class="line number77 index8 alt2"><code class="java
spaces"> </code><code
class="java plain">parser.parse(stream, handler, metadata, </code><code
class="java keyword">new</code> <code class="java
plain">ParseContext());</code></div><div class="line number78 index9
alt1"><code class="java spaces"> </code><code
class="java plain">}</code></div><div class="line number79 index10 alt2"><code
class="java spaces"> </code><code class="java
keyword">finally</code> <code class="java plain">{
</code></div><div class="line number80 index11 alt1"><code class="java
spaces"> </code><code
class="java plain">stream.close();</code></div><div class="line number81
index12 alt2"><code class="java spaces"> </code><code
class="java plain">}</code></div><div class="line number82 index13 alt1"><code
class="java spaces"> </code><code class="java
plain">String[] numbers = metadata.getValues(</code><code class="java
string">"phonenumbers"</code><code class="java plain">);</code></div><div
class="line number83 index14 alt2"><code class="java
spaces"> </code><code class="java keyword">for</code>
<code class="java plain">(String number : numbers) {</code></div><div
class="line number84 index15 alt1"><code class="java
spaces"> </code><code
class="java plain">phoneNumbers.add(number);</code></div><div class="line
number85 index16 alt2"><code class="java spaces"> </code><code class="java
plain">}</code></div><div class="line number86 index17 alt1"><code class="java
plain">}</code></div></div></td></tr></tbody></table></div></div></div>
<div class="section">
<h3><a name="Translation">Translation</a></h3>
<p>Tika provides a pluggable Translation system, which allows you to send the
results of parsing off to an external system or program to have the text
translated into another language.</p>
<div class="section">
<h4><a name="Translation_using_the_Microsoft_Translation_API">Translation
using the Microsoft Translation API</a></h4>
-<p>In order to use the Microsoft Translation API, you need to sign up for an
account and get a key, then pass that to Tika when you have the translation
done.</p><div id="highlighter_451559" class="syntaxhighlighter nogutter
java"><table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td
class="code"><div class="container"><div class="line number23 index0
alt2"><code class="java keyword">public</code> <code class="java plain">String
microsoftTranslateToFrench(String text) {</code></div><div class="line number24
index1 alt1"><code class="java spaces"> </code><code
class="java plain">MicrosoftTranslator translator = </code><code class="java
keyword">new</code> <code class="java
plain">MicrosoftTranslator();</code></div><div class="line number25 index2
alt2"><code class="java spaces"> </code><code
class="java comments">// Change the id and secret! See <a
href="http://msdn.microsoft.com/en-us/library/hh454950.aspx.">http://msdn.microsoft.com/en-us/library/hh454950.aspx.</a></code></div><div
class="line
number26 index3 alt1"><code class="java
spaces"> </code><code class="java
plain">translator.setId(</code><code class="java string">"dummy-id"</code><code
class="java plain">);</code></div><div class="line number27 index4 alt2"><code
class="java spaces"> </code><code class="java
plain">translator.setSecret(</code><code class="java
string">"dummy-secret"</code><code class="java plain">);</code></div><div
class="line number28 index5 alt1"><code class="java
spaces"> </code><code class="java keyword">try</code>
<code class="java plain">{</code></div><div class="line number29 index6
alt2"><code class="java
spaces"> </code><code
class="java keyword">return</code> <code class="java
plain">translator.translate(text, </code><code class="java
string">"fr"</code><code class="java plain">);</code></div><div
class="line number30 index7 alt1"><code class="java
spaces"> </code><code class="java plain">} </code><code
class="java keyword">catch</code> <code class="java plain">(Exception e)
{</code></div><div class="line number31 index8 alt2"><code class="java
spaces"> </code><code
class="java keyword">return</code> <code class="java string">"Error while
translating."</code><code class="java plain">;</code></div><div class="line
number32 index9 alt1"><code class="java
spaces"> </code><code class="java
plain">}</code></div><div class="line number33 index10 alt2"><code class="java
plain">}</code></div></div></td></tr></tbody></table></div></div></div></div>
+<p>In order to use the Microsoft Translation API, you need to sign up for an
account and get a key, then pass that to Tika when you have the translation
done.</p><div id="highlighter_225487" class="syntaxhighlighter nogutter
java"><table border="0" cellpadding="0" cellspacing="0"><tbody><tr><td
class="code"><div class="container"><div class="line number23 index0
alt2"><code class="java keyword">public</code> <code class="java plain">String
microsoftTranslateToFrench(String text) {</code></div><div class="line number24
index1 alt1"><code class="java spaces"> </code><code
class="java plain">MicrosoftTranslator translator = </code><code class="java
keyword">new</code> <code class="java
plain">MicrosoftTranslator();</code></div><div class="line number25 index2
alt2"><code class="java spaces"> </code><code
class="java comments">// Change the id and secret! See <a
href="http://msdn.microsoft.com/en-us/library/hh454950.aspx.">http://msdn.microsoft.com/en-us/library/hh454950.aspx.</a></code></div><div
class="line
number26 index3 alt1"><code class="java
spaces"> </code><code class="java
plain">translator.setId(</code><code class="java string">"dummy-id"</code><code
class="java plain">);</code></div><div class="line number27 index4 alt2"><code
class="java spaces"> </code><code class="java
plain">translator.setSecret(</code><code class="java
string">"dummy-secret"</code><code class="java plain">);</code></div><div
class="line number28 index5 alt1"><code class="java
spaces"> </code><code class="java keyword">try</code>
<code class="java plain">{</code></div><div class="line number29 index6
alt2"><code class="java
spaces"> </code><code
class="java keyword">return</code> <code class="java
plain">translator.translate(text, </code><code class="java
string">"fr"</code><code class="java plain">);</code></div><div
class="line number30 index7 alt1"><code class="java
spaces"> </code><code class="java plain">} </code><code
class="java keyword">catch</code> <code class="java plain">(Exception e)
{</code></div><div class="line number31 index8 alt2"><code class="java
spaces"> </code><code
class="java keyword">return</code> <code class="java string">"Error while
translating."</code><code class="java plain">;</code></div><div class="line
number32 index9 alt1"><code class="java
spaces"> </code><code class="java
plain">}</code></div><div class="line number33 index10 alt2"><code class="java
plain">}</code></div></div></td></tr></tbody></table></div></div></div></div>
</div>
<div id="sidebar">
<div id="navigation">
Modified: tika/site/src/site/apt/1.7/examples.apt
URL:
http://svn.apache.org/viewvc/tika/site/src/site/apt/1.7/examples.apt?rev=1646921&r1=1646920&r2=1646921&view=diff
==============================================================================
--- tika/site/src/site/apt/1.7/examples.apt (original)
+++ tika/site/src/site/apt/1.7/examples.apt Sat Dec 20 07:12:41 2014
@@ -24,20 +24,45 @@ Apache Tika API Usage Examples
{{{https://svn.apache.org/repos/asf/tika/trunk/tika-example}Tika Example
module}} in SVN.
- TODO Complete
-
%{toc|section=1|fromDepth=1}
* {Parsing}
- TODO Explain the options
+ Tika provides a number of different ways to trigger the parsing
+ of a file. These provide different levels of control and flexibility,
+ with varying levels of complexity to trigger.
** {Parsing using the Tika Facade}
- TODO Explain about using this
+ The {{{./apidocs/org/apache/tika/Tika.html}Tika facade}}
+ provides a number of quick and easy ways to have your content
+ parsed by Tika and the resulting plain text returned.
%{include|source=src/examples-src/main/java/org/apache/tika/example/ParsingExample.java|snippet=aj:..parseToStringExample()|show-gutter=false}
+** {Parsing using the Auto-Detect Parser}
+
+ For more control, you can call the
+ {{{./apidocs/org/apache/tika/parser/Parser.html}Tika Parsers}}
+ directly. Most likely, you'll want to start out using the
+ {{{./apidocs/org/apache/tika/parser/AutoDetectParser.html}Auto-Detect Parser}},
+ which works out what kind of content you have, then finds an appropriate
+ parser to call for you.
+
+%{include|source=src/examples-src/main/java/org/apache/tika/example/ParsingExample.java|snippet=aj:..parseExample()|show-gutter=false}
+
+* {Picking different output types}
+
+ With Tika, you can get the textual content of your files returned
+ in a number of different formats. These can be plain text, html, xhtml,
+ xhtml of one part of the file etc. This is controlled based on the
+ {{{http://docs.oracle.com/javase/7/docs/api/org/xml/sax/ContentHandler.html}ContentHandler}}
+ you supply to the Parser.
+
+** {Parsing to Plain Text}
+
+ TODO
+
* {Custom Content Handlers}
The textual output of parsing a file with Tika is returned via the SAX