Author: nick
Date: Wed Sep 18 13:50:40 2019
New Revision: 1867122
URL: http://svn.apache.org/viewvc?rev=1867122&view=rev
Log:
TIKA-2947 Update source code link
Modified:
tika/site/publish/0.10/gettingstarted.html
tika/site/publish/0.5/gettingstarted.html
tika/site/publish/0.6/gettingstarted.html
tika/site/publish/0.7/gettingstarted.html
tika/site/publish/0.8/gettingstarted.html
tika/site/publish/0.9/gettingstarted.html
tika/site/publish/1.0/gettingstarted.html
tika/site/publish/1.1/gettingstarted.html
tika/site/publish/1.10/gettingstarted.html
tika/site/publish/1.11/gettingstarted.html
tika/site/publish/1.12/gettingstarted.html
tika/site/publish/1.13/gettingstarted.html
tika/site/publish/1.14/gettingstarted.html
tika/site/publish/1.15/gettingstarted.html
tika/site/publish/1.16/gettingstarted.html
tika/site/publish/1.17/gettingstarted.html
tika/site/publish/1.18/gettingstarted.html
tika/site/publish/1.19.1/gettingstarted.html
tika/site/publish/1.19/gettingstarted.html
tika/site/publish/1.2/gettingstarted.html
tika/site/publish/1.20/gettingstarted.html
tika/site/publish/1.21/gettingstarted.html
tika/site/publish/1.22/gettingstarted.html
tika/site/publish/1.3/gettingstarted.html
tika/site/publish/1.4/gettingstarted.html
tika/site/publish/1.5/gettingstarted.html
tika/site/publish/1.6/gettingstarted.html
tika/site/publish/1.7/gettingstarted.html
tika/site/publish/1.8/gettingstarted.html
tika/site/publish/1.9/gettingstarted.html
Modified: tika/site/publish/0.10/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/0.10/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/0.10/gettingstarted.html (original)
+++ tika/site/publish/0.10/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 5 or higher to build Tika.</p></div>
<div class="section">
@@ -116,16 +115,14 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>0.10</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on tika-parsers instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.10</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
The listing below shows all the compile-scope dependencies of tika-parsers in
the Tika 0.10 release.</p>
<div>
<pre>org.apache.tika:tika-parsers:bundle:0.10
@@ -154,8 +151,7 @@
+- com.drewnoakes:metadata-extractor:jar:2.4.0-beta-1:compile
+- de.l3s.boilerpipe:boilerpipe:jar:1.1.0:compile
+- rome:rome:jar:0.9:compile
-| \- jdom:jdom:jar:1.0:compile
-</pre></div></div>
+| \- jdom:jdom:jar:1.0:compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>Unless you use a dependency manager tool like <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a>, to use Tika in you
application you can include the Tika jar files and the dependencies
individually.</p>
@@ -187,8 +183,7 @@
<pathelement location="path/to/boilerpipe-1.1.0.jar"/>
<pathelement location="path/to/rome-0.9.jar"/>
<pathelement location="path/to/jdom-1.0.jar"/>
-</classpath>
-</pre></div>
+</classpath></pre></div>
<p>An easy way to gather all these libraries is to run "mvn
dependency:copy-dependencies" in the tika-parsers source directory. This
will copy all Tika dependencies to the <tt>target/dependencies</tt>
directory.</p>
<p>Alternatively you can simply drop the entire tika-app jar to your classpath
to get all of the above dependencies in a single archive.</p></div>
<div class="section">
@@ -253,15 +248,13 @@ Description:
Use the "-server" (or "-s") option to start the
Apache Tika server. The server will listen to the
- ports you specify as one or more arguments.
-</pre></div>
+ ports you specify as one or more arguments.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app-0.10.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
</div>
<div id="sidebar">
<div id="navigation">
Modified: tika/site/publish/0.5/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/0.5/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/0.5/gettingstarted.html (original)
+++ tika/site/publish/0.5/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 5 or higher to build Tika.</p></div>
<div class="section">
@@ -116,16 +115,14 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>x.y</version> <!-- 0.5 or higher -->
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>This dependency only gives you basic Tika functionality without any of the
parser libraries. If you want to use Tika to parse documents (instead of simply
detecting document types, etc.), you also need the tika-parsers dependency: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>x.y</version> <!-- same version as in tika-core -->
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project. You need to make sure that these dependencies
won't conflict with your existing project dependencies. The listing below shows
all the compile-scope dependencies of the current Tika parsers release (0.5,
November 2009). You can use the command "mvn dependency:tree" to
check the latest tree of dependencies on any one of Tika's core, parsers and
app projects.</p>
<div>
<pre>org.apache.tika:tika-parent:pom:0.5
@@ -173,8 +170,7 @@ org.apache.tika:tika-app:bundle:0.5
+- org.ccil.cowan.tagsoup:tagsoup:jar:1.2:provided
+- asm:asm:jar:3.1:provided
+- log4j:log4j:jar:1.2.14:provided
- \- com.drewnoakes:metadata-extractor:jar:2.4.0-beta-1:provided
-</pre></div></div>
+ \- com.drewnoakes:metadata-extractor:jar:2.4.0-beta-1:provided</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>Unless you use a dependency manager tool like <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a>, to use Tika in you
application you can include the Tika jar files and the dependencies
individually.</p>
@@ -202,8 +198,7 @@ org.apache.tika:tika-app:bundle:0.5
<pathelement location="path/to/geronimo-stax-api_1.0_spec-1.0.jar"/>
<pathelement location="path/to/asm-3.1.jar"/>
<pathelement location="path/to/log4j-1.2.14.jar"/>
-</classpath>
-</pre></div>
+</classpath></pre></div>
<p>An easy way to gather all these libraries is to run "mvn
dependency:copy-dependencies" in the Tika source directory. This will copy
all Tika dependencies to the <tt>target/dependencies</tt> directory.</p>
<p>Alternatively you can simply drop the entire tika-app jar to your classpath
to get all of the above dependencies in a single archive.</p></div>
<div class="section">
@@ -237,15 +232,13 @@ Description:
Use the "--gui" (or "-g") option to start
the Apache Tika GUI. You can drag and drop files
from a normal file explorer to the GUI window to
- extract text content and metadata from the files.
-</pre></div>
+ extract text content and metadata from the files.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app-x.y.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
</div>
<div id="sidebar">
<div id="navigation">
Modified: tika/site/publish/0.6/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/0.6/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/0.6/gettingstarted.html (original)
+++ tika/site/publish/0.6/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 5 or higher to build Tika.</p></div>
<div class="section">
@@ -116,16 +115,14 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>0.6</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on tika-parsers instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.6</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
The listing below shows all the compile-scope dependencies of tika-parsers in
the Tika 0.6 release.</p>
<div>
<pre>org.apache.tika:tika-parsers:bundle:0.6
@@ -146,8 +143,7 @@
+- org.ccil.cowan.tagsoup:tagsoup:jar:1.2:compile
+- asm:asm:jar:3.1:compile
+- log4j:log4j:jar:1.2.14:compile
-\- com.drewnoakes:metadata-extractor:jar:2.4.0-beta-1:compile
-</pre></div></div>
+\- com.drewnoakes:metadata-extractor:jar:2.4.0-beta-1:compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>Unless you use a dependency manager tool like <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a>, to use Tika in you
application you can include the Tika jar files and the dependencies
individually.</p>
@@ -173,8 +169,7 @@
<pathelement location="path/to/asm-3.1.jar"/>
<pathelement location="path/to/log4j-1.2.14.jar"/>
<pathelement location="path/to/metadata-extractor-2.4.0-beta-1.jar"/>
-</classpath>
-</pre></div>
+</classpath></pre></div>
<p>An easy way to gather all these libraries is to run "mvn
dependency:copy-dependencies" in the tika-parsers source directory. This
will copy all Tika dependencies to the <tt>target/dependencies</tt>
directory.</p>
<p>Alternatively you can simply drop the entire tika-app jar to your classpath
to get all of the above dependencies in a single archive.</p></div>
<div class="section">
@@ -208,15 +203,13 @@ Description:
Use the "--gui" (or "-g") option to start
the Apache Tika GUI. You can drag and drop files
from a normal file explorer to the GUI window to
- extract text content and metadata from the files.
-</pre></div>
+ extract text content and metadata from the files.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app-0.6.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
</div>
<div id="sidebar">
<div id="navigation">
Modified: tika/site/publish/0.7/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/0.7/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/0.7/gettingstarted.html (original)
+++ tika/site/publish/0.7/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 5 or higher to build Tika.</p></div>
<div class="section">
@@ -116,16 +115,14 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>0.7</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on tika-parsers instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.7</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
The listing below shows all the compile-scope dependencies of tika-parsers in
the Tika 0.7 release.</p>
<div>
<pre>org.apache.tika:tika-parsers:bundle:0.7
@@ -152,8 +149,7 @@
+- org.mockito:mockito-core:jar:1.7:test
| +- org.hamcrest:hamcrest-core:jar:1.1:test
| \- org.objenesis:objenesis:jar:1.0:test
-\- com.drewnoakes:metadata-extractor:jar:2.4.0-beta-1:compile
-</pre></div></div>
+\- com.drewnoakes:metadata-extractor:jar:2.4.0-beta-1:compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>Unless you use a dependency manager tool like <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a>, to use Tika in you
application you can include the Tika jar files and the dependencies
individually.</p>
@@ -179,8 +175,7 @@
<pathelement location="path/to/asm-3.1.jar"/>
<pathelement location="path/to/log4j-1.2.14.jar"/>
<pathelement location="path/to/metadata-extractor-2.4.0-beta-1.jar"/>
-</classpath>
-</pre></div>
+</classpath></pre></div>
<p>An easy way to gather all these libraries is to run "mvn
dependency:copy-dependencies" in the tika-parsers source directory. This
will copy all Tika dependencies to the <tt>target/dependencies</tt>
directory.</p>
<p>Alternatively you can simply drop the entire tika-app jar to your classpath
to get all of the above dependencies in a single archive.</p></div>
<div class="section">
@@ -214,15 +209,13 @@ Description:
Use the "--gui" (or "-g") option to start
the Apache Tika GUI. You can drag and drop files
from a normal file explorer to the GUI window to
- extract text content and metadata from the files.
-</pre></div>
+ extract text content and metadata from the files.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app-0.7.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
</div>
<div id="sidebar">
<div id="navigation">
Modified: tika/site/publish/0.8/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/0.8/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/0.8/gettingstarted.html (original)
+++ tika/site/publish/0.8/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 5 or higher to build Tika.</p></div>
<div class="section">
@@ -116,16 +115,14 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>0.8</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on tika-parsers instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.8</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
The listing below shows all the compile-scope dependencies of tika-parsers in
the Tika 0.8 release.</p>
<div>
<pre>org.apache.tika:tika-parsers:bundle:0.8
@@ -146,8 +143,7 @@
+- org.ccil.cowan.tagsoup:tagsoup:jar:1.2:compile
+- asm:asm:jar:3.1:compile
+- log4j:log4j:jar:1.2.14:compile
-\- com.drewnoakes:metadata-extractor:jar:2.4.0-beta-1:compile
-</pre></div></div>
+\- com.drewnoakes:metadata-extractor:jar:2.4.0-beta-1:compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>Unless you use a dependency manager tool like <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a>, to use Tika in you
application you can include the Tika jar files and the dependencies
individually.</p>
@@ -173,8 +169,7 @@
<pathelement location="path/to/asm-3.1.jar"/>
<pathelement location="path/to/log4j-1.2.14.jar"/>
<pathelement location="path/to/metadata-extractor-2.4.0-beta-1.jar"/>
-</classpath>
-</pre></div>
+</classpath></pre></div>
<p>An easy way to gather all these libraries is to run "mvn
dependency:copy-dependencies" in the tika-parsers source directory. This
will copy all Tika dependencies to the <tt>target/dependencies</tt>
directory.</p>
<p>Alternatively you can simply drop the entire tika-app jar to your classpath
to get all of the above dependencies in a single archive.</p></div>
<div class="section">
@@ -208,15 +203,13 @@ Description:
Use the "--gui" (or "-g") option to start
the Apache Tika GUI. You can drag and drop files
from a normal file explorer to the GUI window to
- extract text content and metadata from the files.
-</pre></div>
+ extract text content and metadata from the files.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app-0.8.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
</div>
<div id="sidebar">
<div id="navigation">
Modified: tika/site/publish/0.9/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/0.9/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/0.9/gettingstarted.html (original)
+++ tika/site/publish/0.9/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 5 or higher to build Tika.</p></div>
<div class="section">
@@ -116,16 +115,14 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>0.9</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on tika-parsers instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.9</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
The listing below shows all the compile-scope dependencies of tika-parsers in
the Tika 0.9 release.</p>
<div>
<pre>\- org.apache.tika:tika-parsers:jar:0.9:provided
@@ -153,8 +150,7 @@
+- com.drewnoakes:metadata-extractor:jar:2.4.0-beta-1:provided
+- de.l3s.boilerpipe:boilerpipe:jar:1.1.0:provided
\- rome:rome:jar:0.9:provided
- \- jdom:jdom:jar:1.0:provided
-</pre></div></div>
+ \- jdom:jdom:jar:1.0:provided</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>Unless you use a dependency manager tool like <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a>, to use Tika in you
application you can include the Tika jar files and the dependencies
individually.</p>
@@ -180,8 +176,7 @@
<pathelement location="path/to/asm-3.1.jar"/>
<pathelement location="path/to/log4j-1.2.14.jar"/>
<pathelement location="path/to/metadata-extractor-2.4.0-beta-1.jar"/>
-</classpath>
-</pre></div>
+</classpath></pre></div>
<p>An easy way to gather all these libraries is to run "mvn
dependency:copy-dependencies" in the tika-parsers source directory. This
will copy all Tika dependencies to the <tt>target/dependencies</tt>
directory.</p>
<p>Alternatively you can simply drop the entire tika-app jar to your classpath
to get all of the above dependencies in a single archive.</p></div>
<div class="section">
@@ -215,15 +210,13 @@ Description:
Use the "--gui" (or "-g") option to start
the Apache Tika GUI. You can drag and drop files
from a normal file explorer to the GUI window to
- extract text content and metadata from the files.
-</pre></div>
+ extract text content and metadata from the files.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app-0.9.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
</div>
<div id="sidebar">
<div id="navigation">
Modified: tika/site/publish/1.0/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/1.0/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/1.0/gettingstarted.html (original)
+++ tika/site/publish/1.0/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 5 or higher to build Tika.</p></div>
<div class="section">
@@ -116,16 +115,14 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>1.0</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on tika-parsers instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>1.0</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
The listing below shows all the compile-scope dependencies of tika-parsers in
the Tika 1.0 release.</p>
<div>
<pre> org.apache.tika:tika-parsers:bundle:1.0
@@ -154,8 +151,7 @@
+- com.drewnoakes:metadata-extractor:jar:2.4.0-beta-1:compile
+- de.l3s.boilerpipe:boilerpipe:jar:1.1.0:compile
+- rome:rome:jar:0.9:compile
- \- jdom:jdom:jar:1.0:compile
-</pre></div></div>
+ \- jdom:jdom:jar:1.0:compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>Unless you use a dependency manager tool like <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a>, to use Tika in you
application you can include the Tika jar files and the dependencies
individually.</p>
@@ -181,8 +177,7 @@
<pathelement location="path/to/asm-3.1.jar"/>
<pathelement location="path/to/log4j-1.2.14.jar"/>
<pathelement location="path/to/metadata-extractor-2.4.0-beta-1.jar"/>
-</classpath>
-</pre></div>
+</classpath></pre></div>
<p>An easy way to gather all these libraries is to run "mvn
dependency:copy-dependencies" in the tika-parsers source directory. This
will copy all Tika dependencies to the <tt>target/dependencies</tt>
directory.</p>
<p>Alternatively you can simply drop the entire tika-app jar to your classpath
to get all of the above dependencies in a single archive.</p></div>
<div class="section">
@@ -216,15 +211,13 @@ Description:
Use the "--gui" (or "-g") option to start
the Apache Tika GUI. You can drag and drop files
from a normal file explorer to the GUI window to
- extract text content and metadata from the files.
-</pre></div>
+ extract text content and metadata from the files.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app-1.0.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
</div>
<div id="sidebar">
<div id="navigation">
Modified: tika/site/publish/1.1/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/1.1/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/1.1/gettingstarted.html (original)
+++ tika/site/publish/1.1/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 5 or higher to build Tika.</p></div>
<div class="section">
@@ -116,16 +115,14 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>1.1</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on tika-parsers instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>1.1</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
The listing below shows all the compile-scope dependencies of tika-parsers in
the Tika 1.1 release.</p>
<div>
<pre>+- org.apache.tika:tika-core:jar:1.1:compile
@@ -167,7 +164,6 @@
| \- org.objenesis:objenesis:jar:1.0:test
\- org.slf4j:slf4j-log4j12:jar:1.5.6:test
\- log4j:log4j:jar:1.2.14:test
-
</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
@@ -194,8 +190,7 @@
<pathelement location="path/to/asm-3.1.jar"/>
<pathelement location="path/to/log4j-1.2.14.jar"/>
<pathelement
location="path/to/metadata-extractor-2.4.0-beta-1.jar"/>
-</classpath>
-</pre></div>
+</classpath></pre></div>
<p>An easy way to gather all these libraries is to run "mvn
dependency:copy-dependencies" in the tika-parsers source directory. This
will copy all Tika dependencies to the <tt>target/dependencies</tt>
directory.</p>
<p>Alternatively you can simply drop the entire tika-app jar to your classpath
to get all of the above dependencies in a single archive.</p></div>
<div class="section">
@@ -229,15 +224,13 @@ Description:
Use the "--gui" (or "-g") option to start
the Apache Tika GUI. You can drag and drop files
from a normal file explorer to the GUI window to
- extract text content and metadata from the files.
-</pre></div>
+ extract text content and metadata from the files.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app-1.1.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
</div>
<div id="sidebar">
<div id="navigation">
Modified: tika/site/publish/1.10/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/1.10/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/1.10/gettingstarted.html (original)
+++ tika/site/publish/1.10/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 7 or higher to build Tika.</p></div>
<div class="section">
@@ -118,20 +117,17 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>...</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on tika-parsers instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>...</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
You can use the following command in the tika-parsers directory to get a full
listing of all the dependencies.</p>
<div>
-<pre>$ mvn dependency:tree | grep :compile
-</pre></div></div>
+<pre>$ mvn dependency:tree | grep :compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>Unless you use a dependency manager tool like <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a>, the easiest way to use Tika
is to include either the tika-core or the tika-app jar in your classpath,
depending on whether you want just the core functionality or also all the
parser implementations.</p>
@@ -144,8 +140,7 @@
<!-- or: -->
<pathelement
location="path/to/tika-app-${tika.version}.jar"/>
-</classpath>
-</pre></div></div>
+</classpath></pre></div></div>
<div class="section">
<h2><a name="Using_Tika_as_a_command_line_utility"></a>Using Tika as a command
line utility</h2>
<p>The Tika application jar (tika-app-*.jar) can be used as a command line
utility for extracting text content and metadata from all sorts of files. This
runnable jar contains all the dependencies it needs, so you don't need to worry
about classpath settings to run it.</p>
@@ -215,15 +210,13 @@ Description:
Use the "--server" (or "-s") option to start the
Apache Tika server. The server will listen to the
- ports you specify as one or more arguments.
-</pre></div>
+ ports you specify as one or more arguments.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
<div class="section">
<h2><a name="Wrappers"></a>Wrappers</h2>
<p>Several wrappers are available to use Tika in another programming language,
such as <a class="externalLink"
href="https://github.com/aviks/Taro.jl">Julia</a> or <a class="externalLink"
href="https://github.com/chrismattmann/tika-python">Python</a>.</p></div>
Modified: tika/site/publish/1.11/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/1.11/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/1.11/gettingstarted.html (original)
+++ tika/site/publish/1.11/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 7 or higher to build Tika.</p></div>
<div class="section">
@@ -118,36 +117,31 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>1.11</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on <tt> tika-parsers </tt>
instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>1.11</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
You can use the following command in the tika-parsers directory to get a full
listing of all the dependencies.</p>
<div>
-<pre>$ mvn dependency:tree | grep :compile
-</pre></div></div>
+<pre>$ mvn dependency:tree | grep :compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_a_Gradle-built_project"></a>Using Tika in a
Gradle-built project</h2>
<p>To add a dependency on Apache Tika to your Gradle built project, including
the full set of parsers, you should depend on the <tt> tika-parsers </tt>
artifact:</p>
<div>
<pre>dependencies {
runtime 'org.apache.tika:tika-parsers:1.11'
-}
-</pre></div></div>
+}</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>If you are using <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a> as your dependency manager
tool with Ant, then to include Tika with the full set of parsers, you should
depend on the <tt> tika-parsers </tt> artifact like this:</p>
<div>
<pre> <dependencies>
<dependency org="org.apache.tika"
name="tika-parsers" rev="1.11"/>
- </dependencies>
-</pre></div>
+ </dependencies></pre></div>
<p>Otherwise, probably the easiest way to use Tika is to include the full <tt>
tika-app </tt> jar on your classpath. For just core functionality, you can add
the <tt> tika-core </tt> jar, but be aware that the full set of parsers have a
large number of dependencies which must be included which is very fiddly to do
by hand with Ant! To include Tika in your Ant project, you should do something
like:</p>
<div>
<pre><classpath>
@@ -158,8 +152,7 @@
<!-- or: Tika with all Parsers-->
<pathelement
location="path/to/tika-app-${tika.version}.jar"/>
-</classpath>
-</pre></div></div>
+</classpath></pre></div></div>
<div class="section">
<h2><a name="Using_Tika_as_a_command_line_utility"></a>Using Tika as a command
line utility</h2>
<p>The Tika application jar (tika-app-*.jar) can be used as a command line
utility for extracting text content and metadata from all sorts of files. This
runnable jar contains all the dependencies it needs, so you don't need to worry
about classpath settings to run it.</p>
@@ -229,15 +222,13 @@ Description:
Use the "--server" (or "-s") option to start the
Apache Tika server. The server will listen to the
- ports you specify as one or more arguments.
-</pre></div>
+ ports you specify as one or more arguments.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
<div class="section">
<h2><a name="Wrappers"></a>Wrappers</h2>
<p>Several wrappers are available to use Tika in another programming language,
such as <a class="externalLink"
href="https://github.com/aviks/Taro.jl">Julia</a> or <a class="externalLink"
href="https://github.com/chrismattmann/tika-python">Python</a>.</p></div>
Modified: tika/site/publish/1.12/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/1.12/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/1.12/gettingstarted.html (original)
+++ tika/site/publish/1.12/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 7 or higher to build Tika.</p></div>
<div class="section">
@@ -118,36 +117,31 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>1.12</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on <tt> tika-parsers </tt>
instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>1.12</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
You can use the following command in the tika-parsers directory to get a full
listing of all the dependencies.</p>
<div>
-<pre>$ mvn dependency:tree | grep :compile
-</pre></div></div>
+<pre>$ mvn dependency:tree | grep :compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_a_Gradle-built_project"></a>Using Tika in a
Gradle-built project</h2>
<p>To add a dependency on Apache Tika to your Gradle built project, including
the full set of parsers, you should depend on the <tt> tika-parsers </tt>
artifact:</p>
<div>
<pre>dependencies {
runtime 'org.apache.tika:tika-parsers:1.12'
-}
-</pre></div></div>
+}</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>If you are using <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a> as your dependency manager
tool with Ant, then to include Tika with the full set of parsers, you should
depend on the <tt> tika-parsers </tt> artifact like this:</p>
<div>
<pre> <dependencies>
<dependency org="org.apache.tika"
name="tika-parsers" rev="1.12"/>
- </dependencies>
-</pre></div>
+ </dependencies></pre></div>
<p>Otherwise, probably the easiest way to use Tika is to include the full <tt>
tika-app </tt> jar on your classpath. For just core functionality, you can add
the <tt> tika-core </tt> jar, but be aware that the full set of parsers have a
large number of dependencies which must be included which is very fiddly to do
by hand with Ant! To include Tika in your Ant project, you should do something
like:</p>
<div>
<pre><classpath>
@@ -158,8 +152,7 @@
<!-- or: Tika with all Parsers-->
<pathelement
location="path/to/tika-app-${tika.version}.jar"/>
-</classpath>
-</pre></div></div>
+</classpath></pre></div></div>
<div class="section">
<h2><a name="Using_Tika_as_a_command_line_utility"></a>Using Tika as a command
line utility</h2>
<p>The Tika application jar (tika-app-*.jar) can be used as a command line
utility for extracting text content and metadata from all sorts of files. This
runnable jar contains all the dependencies it needs, so you don't need to worry
about classpath settings to run it.</p>
@@ -229,15 +222,13 @@ Description:
Use the "--server" (or "-s") option to start the
Apache Tika server. The server will listen to the
- ports you specify as one or more arguments.
-</pre></div>
+ ports you specify as one or more arguments.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
<div class="section">
<h2><a name="Wrappers"></a>Wrappers</h2>
<p>Several wrappers are available to use Tika in another programming language,
such as <a class="externalLink"
href="https://github.com/aviks/Taro.jl">Julia</a> or <a class="externalLink"
href="https://github.com/chrismattmann/tika-python">Python</a>.</p></div>
Modified: tika/site/publish/1.13/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/1.13/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/1.13/gettingstarted.html (original)
+++ tika/site/publish/1.13/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 7 or higher to build Tika.</p></div>
<div class="section">
@@ -118,36 +117,31 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>1.13</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on <tt> tika-parsers </tt>
instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>1.13</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
You can use the following command in the tika-parsers directory to get a full
listing of all the dependencies.</p>
<div>
-<pre>$ mvn dependency:tree | grep :compile
-</pre></div></div>
+<pre>$ mvn dependency:tree | grep :compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_a_Gradle-built_project"></a>Using Tika in a
Gradle-built project</h2>
<p>To add a dependency on Apache Tika to your Gradle built project, including
the full set of parsers, you should depend on the <tt> tika-parsers </tt>
artifact:</p>
<div>
<pre>dependencies {
runtime 'org.apache.tika:tika-parsers:1.13'
-}
-</pre></div></div>
+}</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>If you are using <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a> as your dependency manager
tool with Ant, then to include Tika with the full set of parsers, you should
depend on the <tt> tika-parsers </tt> artifact like this:</p>
<div>
<pre> <dependencies>
<dependency org="org.apache.tika"
name="tika-parsers" rev="1.13"/>
- </dependencies>
-</pre></div>
+ </dependencies></pre></div>
<p>Otherwise, probably the easiest way to use Tika is to include the full <tt>
tika-app </tt> jar on your classpath. For just core functionality, you can add
the <tt> tika-core </tt> jar, but be aware that the full set of parsers have a
large number of dependencies which must be included which is very fiddly to do
by hand with Ant! To include Tika in your Ant project, you should do something
like:</p>
<div>
<pre><classpath>
@@ -158,8 +152,7 @@
<!-- or: Tika with all Parsers-->
<pathelement
location="path/to/tika-app-${tika.version}.jar"/>
-</classpath>
-</pre></div></div>
+</classpath></pre></div></div>
<div class="section">
<h2><a name="Using_Tika_as_a_command_line_utility"></a>Using Tika as a command
line utility</h2>
<p>The Tika application jar (tika-app-*.jar) can be used as a command line
utility for extracting text content and metadata from all sorts of files. This
runnable jar contains all the dependencies it needs, so you don't need to worry
about classpath settings to run it.</p>
@@ -229,15 +222,13 @@ Description:
Use the "--server" (or "-s") option to start the
Apache Tika server. The server will listen to the
- ports you specify as one or more arguments.
-</pre></div>
+ ports you specify as one or more arguments.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
<div class="section">
<h2><a name="Wrappers"></a>Wrappers</h2>
<p>Several wrappers are available to use Tika in another programming language,
such as <a class="externalLink"
href="https://github.com/aviks/Taro.jl">Julia</a> or <a class="externalLink"
href="https://github.com/chrismattmann/tika-python">Python</a>.</p></div>
Modified: tika/site/publish/1.14/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/1.14/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/1.14/gettingstarted.html (original)
+++ tika/site/publish/1.14/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 7 or higher to build Tika.</p></div>
<div class="section">
@@ -118,36 +117,31 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>1.14</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on <tt> tika-parsers </tt>
instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>1.14</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
You can use the following command in the tika-parsers directory to get a full
listing of all the dependencies.</p>
<div>
-<pre>$ mvn dependency:tree | grep :compile
-</pre></div></div>
+<pre>$ mvn dependency:tree | grep :compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_a_Gradle-built_project"></a>Using Tika in a
Gradle-built project</h2>
<p>To add a dependency on Apache Tika to your Gradle built project, including
the full set of parsers, you should depend on the <tt> tika-parsers </tt>
artifact:</p>
<div>
<pre>dependencies {
runtime 'org.apache.tika:tika-parsers:1.14'
-}
-</pre></div></div>
+}</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>If you are using <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a> as your dependency manager
tool with Ant, then to include Tika with the full set of parsers, you should
depend on the <tt> tika-parsers </tt> artifact like this:</p>
<div>
<pre> <dependencies>
<dependency org="org.apache.tika"
name="tika-parsers" rev="1.14"/>
- </dependencies>
-</pre></div>
+ </dependencies></pre></div>
<p>Otherwise, probably the easiest way to use Tika is to include the full <tt>
tika-app </tt> jar on your classpath. For just core functionality, you can add
the <tt> tika-core </tt> jar, but be aware that the full set of parsers have a
large number of dependencies which must be included which is very fiddly to do
by hand with Ant! To include Tika in your Ant project, you should do something
like:</p>
<div>
<pre><classpath>
@@ -158,8 +152,7 @@
<!-- or: Tika with all Parsers-->
<pathelement
location="path/to/tika-app-${tika.version}.jar"/>
-</classpath>
-</pre></div></div>
+</classpath></pre></div></div>
<div class="section">
<h2><a name="Using_Tika_as_a_command_line_utility"></a>Using Tika as a command
line utility</h2>
<p>The Tika application jar (tika-app-*.jar) can be used as a command line
utility for extracting text content and metadata from all sorts of files. This
runnable jar contains all the dependencies it needs, so you don't need to worry
about classpath settings to run it.</p>
@@ -229,15 +222,13 @@ Description:
Use the "--server" (or "-s") option to start the
Apache Tika server. The server will listen to the
- ports you specify as one or more arguments.
-</pre></div>
+ ports you specify as one or more arguments.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
<div class="section">
<h2><a name="Wrappers"></a>Wrappers</h2>
<p>Several wrappers are available to use Tika in another programming language,
such as <a class="externalLink"
href="https://github.com/aviks/Taro.jl">Julia</a> or <a class="externalLink"
href="https://github.com/chrismattmann/tika-python">Python</a>.</p></div>
Modified: tika/site/publish/1.15/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/1.15/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/1.15/gettingstarted.html (original)
+++ tika/site/publish/1.15/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 7 or higher to build Tika.</p></div>
<div class="section">
@@ -118,36 +117,31 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>1.15</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on <tt> tika-parsers </tt>
instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>1.15</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
You can use the following command in the tika-parsers directory to get a full
listing of all the dependencies.</p>
<div>
-<pre>$ mvn dependency:tree | grep :compile
-</pre></div></div>
+<pre>$ mvn dependency:tree | grep :compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_a_Gradle-built_project"></a>Using Tika in a
Gradle-built project</h2>
<p>To add a dependency on Apache Tika to your Gradle built project, including
the full set of parsers, you should depend on the <tt> tika-parsers </tt>
artifact:</p>
<div>
<pre>dependencies {
runtime 'org.apache.tika:tika-parsers:1.15'
-}
-</pre></div></div>
+}</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>If you are using <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a> as your dependency manager
tool with Ant, then to include Tika with the full set of parsers, you should
depend on the <tt> tika-parsers </tt> artifact like this:</p>
<div>
<pre> <dependencies>
<dependency org="org.apache.tika"
name="tika-parsers" rev="1.15"/>
- </dependencies>
-</pre></div>
+ </dependencies></pre></div>
<p>Otherwise, probably the easiest way to use Tika is to include the full <tt>
tika-app </tt> jar on your classpath. For just core functionality, you can add
the <tt> tika-core </tt> jar, but be aware that the full set of parsers have a
large number of dependencies which must be included which is very fiddly to do
by hand with Ant! To include Tika in your Ant project, you should do something
like:</p>
<div>
<pre><classpath>
@@ -158,8 +152,7 @@
<!-- or: Tika with all Parsers-->
<pathelement
location="path/to/tika-app-${tika.version}.jar"/>
-</classpath>
-</pre></div></div>
+</classpath></pre></div></div>
<div class="section">
<h2><a name="Using_Tika_as_a_command_line_utility"></a>Using Tika as a command
line utility</h2>
<p>The Tika application jar (tika-app-*.jar) can be used as a command line
utility for extracting text content and metadata from all sorts of files. This
runnable jar contains all the dependencies it needs, so you don't need to worry
about classpath settings to run it.</p>
@@ -229,15 +222,13 @@ Description:
Use the "--server" (or "-s") option to start the
Apache Tika server. The server will listen to the
- ports you specify as one or more arguments.
-</pre></div>
+ ports you specify as one or more arguments.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
<div class="section">
<h2><a name="Wrappers"></a>Wrappers</h2>
<p>Several wrappers are available to use Tika in another programming language,
such as <a class="externalLink"
href="https://github.com/aviks/Taro.jl">Julia</a> or <a class="externalLink"
href="https://github.com/chrismattmann/tika-python">Python</a>.</p></div>
Modified: tika/site/publish/1.16/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/1.16/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/1.16/gettingstarted.html (original)
+++ tika/site/publish/1.16/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 7 or higher to build Tika.</p></div>
<div class="section">
@@ -118,36 +117,31 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>1.16</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on <tt> tika-parsers </tt>
instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>1.16</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
You can use the following command in the tika-parsers directory to get a full
listing of all the dependencies.</p>
<div>
-<pre>$ mvn dependency:tree | grep :compile
-</pre></div></div>
+<pre>$ mvn dependency:tree | grep :compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_a_Gradle-built_project"></a>Using Tika in a
Gradle-built project</h2>
<p>To add a dependency on Apache Tika to your Gradle built project, including
the full set of parsers, you should depend on the <tt> tika-parsers </tt>
artifact:</p>
<div>
<pre>dependencies {
runtime 'org.apache.tika:tika-parsers:1.16'
-}
-</pre></div></div>
+}</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>If you are using <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a> as your dependency manager
tool with Ant, then to include Tika with the full set of parsers, you should
depend on the <tt> tika-parsers </tt> artifact like this:</p>
<div>
<pre> <dependencies>
<dependency org="org.apache.tika"
name="tika-parsers" rev="1.16"/>
- </dependencies>
-</pre></div>
+ </dependencies></pre></div>
<p>Otherwise, probably the easiest way to use Tika is to include the full <tt>
tika-app </tt> jar on your classpath. For just core functionality, you can add
the <tt> tika-core </tt> jar, but be aware that the full set of parsers have a
large number of dependencies which must be included which is very fiddly to do
by hand with Ant! To include Tika in your Ant project, you should do something
like:</p>
<div>
<pre><classpath>
@@ -158,8 +152,7 @@
<!-- or: Tika with all Parsers-->
<pathelement
location="path/to/tika-app-${tika.version}.jar"/>
-</classpath>
-</pre></div></div>
+</classpath></pre></div></div>
<div class="section">
<h2><a name="Using_Tika_as_a_command_line_utility"></a>Using Tika as a command
line utility</h2>
<p>The Tika application jar (tika-app-*.jar) can be used as a command line
utility for extracting text content and metadata from all sorts of files. This
runnable jar contains all the dependencies it needs, so you don't need to worry
about classpath settings to run it.</p>
@@ -229,15 +222,13 @@ Description:
Use the "--server" (or "-s") option to start the
Apache Tika server. The server will listen to the
- ports you specify as one or more arguments.
-</pre></div>
+ ports you specify as one or more arguments.</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
<div class="section">
<h2><a name="Wrappers"></a>Wrappers</h2>
<p>Several wrappers are available to use Tika in another programming language,
such as <a class="externalLink"
href="https://github.com/aviks/Taro.jl">Julia</a> or <a class="externalLink"
href="https://github.com/chrismattmann/tika-python">Python</a>.</p></div>
Modified: tika/site/publish/1.17/gettingstarted.html
URL:
http://svn.apache.org/viewvc/tika/site/publish/1.17/gettingstarted.html?rev=1867122&r1=1867121&r2=1867122&view=diff
==============================================================================
--- tika/site/publish/1.17/gettingstarted.html (original)
+++ tika/site/publish/1.17/gettingstarted.html Wed Sep 18 13:50:40 2019
@@ -89,11 +89,10 @@
<p>This document describes how to build Apache Tika from sources and how to
start using Tika in an application.</p></div>
<div class="section">
<h2><a name="Getting_and_building_the_sources"></a>Getting and building the
sources</h2>
-<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../source-repository.html">checkout</a> the latest sources from version
control.</p>
+<p>To build Tika from sources you first need to either <a
href="../download.html">download</a> a source release or <a
href="../contribute.html#Source_Code">checkout</a> the latest sources from
version control.</p>
<p>Once you have the sources, you can build them using the <a
class="externalLink" href="http://maven.apache.org/">Maven 2</a> build system.
Executing the following command in the base directory will build the sources
and install the resulting artifacts in your local Maven repository.</p>
<div>
-<pre>mvn install
-</pre></div>
+<pre>mvn install</pre></div>
<p>See the Maven documentation for more information about the available build
options.</p>
<p>Note that you need Java 7 or higher to build Tika.</p></div>
<div class="section">
@@ -120,36 +119,31 @@
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>1.17</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>If you want to use Tika to parse documents (instead of simply detecting
document types, etc.), you'll want to depend on <tt> tika-parsers </tt>
instead: </p>
<div>
<pre> <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>1.17</version>
- </dependency>
-</pre></div>
+ </dependency></pre></div>
<p>Note that adding this dependency will introduce a number of transitive
dependencies to your project, including one on tika-core. You need to make sure
that these dependencies won't conflict with your existing project dependencies.
You can use the following command in the tika-parsers directory to get a full
listing of all the dependencies.</p>
<div>
-<pre>$ mvn dependency:tree | grep :compile
-</pre></div></div>
+<pre>$ mvn dependency:tree | grep :compile</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_a_Gradle-built_project"></a>Using Tika in a
Gradle-built project</h2>
<p>To add a dependency on Apache Tika to your Gradle built project, including
the full set of parsers, you should depend on the <tt> tika-parsers </tt>
artifact:</p>
<div>
<pre>dependencies {
runtime 'org.apache.tika:tika-parsers:1.17'
-}
-</pre></div></div>
+}</pre></div></div>
<div class="section">
<h2><a name="Using_Tika_in_an_Ant_project"></a>Using Tika in an Ant
project</h2>
<p>If you are using <a class="externalLink"
href="http://ant.apache.org/ivy/">Apache Ivy</a> as your dependency manager
tool with Ant, then to include Tika with the full set of parsers, you should
depend on the <tt> tika-parsers </tt> artifact like this:</p>
<div>
<pre> <dependencies>
<dependency org="org.apache.tika"
name="tika-parsers" rev="1.17"/>
- </dependencies>
-</pre></div>
+ </dependencies></pre></div>
<p>Otherwise, probably the easiest way to use Tika is to include the full <tt>
tika-app </tt> jar on your classpath. For just core functionality, you can add
the <tt> tika-core </tt> jar, but be aware that the full set of parsers have a
large number of dependencies which must be included which is very fiddly to do
by hand with Ant! To include Tika in your Ant project, you should do something
like:</p>
<div>
<pre><classpath>
@@ -160,8 +154,7 @@
<!-- or: Tika with all Parsers-->
<pathelement
location="path/to/tika-app-${tika.version}.jar"/>
-</classpath>
-</pre></div></div>
+</classpath></pre></div></div>
<div class="section">
<h2><a name="Using_Tika_as_a_command_line_utility"></a>Using Tika as a command
line utility</h2>
<p>The Tika application jar (tika-app-*.jar) can be used as a command line
utility for extracting text content and metadata from all sorts of files. This
runnable jar contains all the dependencies it needs, so you don't need to worry
about classpath settings to run it.</p>
@@ -277,15 +270,13 @@ Batch Options:
To modify child process jvm args, prepend "J" as in:
-JXmx4g or -JDlog4j.configuration=file:log4j.xml.
-
</pre></div>
<p>You can also use the jar as a component in a Unix pipeline or as an
external tool in many scripting languages.</p>
<div>
<pre># Check if an Internet resource contains a specific keyword
curl http://.../document.doc \
| java -jar tika-app.jar --text \
- | grep -q keyword
-</pre></div></div>
+ | grep -q keyword</pre></div></div>
<div class="section">
<h2><a name="Wrappers"></a>Wrappers</h2>
<p>Several wrappers are available to use Tika in another programming language,
such as <a class="externalLink"
href="https://github.com/aviks/Taro.jl">Julia</a> or <a class="externalLink"
href="https://github.com/chrismattmann/tika-python">Python</a>.</p></div>