Author: nick
Date: Sun May 26 16:57:55 2013
New Revision: 1486432
URL: http://svn.apache.org/r1486432
Log:
Expand/update the Source Code Repo and Text Extractor docs
Modified:
poi/site/src/documentation/content/xdocs/subversion.xml
poi/site/src/documentation/content/xdocs/text-extraction.xml
Modified: poi/site/src/documentation/content/xdocs/subversion.xml
URL:
http://svn.apache.org/viewvc/poi/site/src/documentation/content/xdocs/subversion.xml?rev=1486432&r1=1486431&r2=1486432&view=diff
==============================================================================
--- poi/site/src/documentation/content/xdocs/subversion.xml (original)
+++ poi/site/src/documentation/content/xdocs/subversion.xml Sun May 26 16:57:55 2013
@@ -31,22 +31,31 @@
<section><title>Download the Source</title>
<p>
Most users of the source code probably don't need to have day to
- day access to the source code as it changes. For these users we
- provide easy to unpack source code from releases via our
+ day access to the source code as it changes. Such users will want
+ to make use of our <link href="download.html">source release</link>
+ packages, which contain the complete source tree for each binary
+ release, suitable for browsing or debugging. These source releases
+ are available from our
<link href="download.html">download page.</link>
</p>
+ <p>
+ The Apache POI source code is also available as source artifacts
+ in the Maven Central repository, which may be helpful for those
+ users who make use of POI and wish to inspect the source (e.g. when
+ debugging in an IDE).
+ </p>
</section>
<section><title>Access the Version Controlled Source Code</title>
<p>
- For information on connecting to the ASF Subversion repositories,
- see the
+ For general information on connecting to the ASF Subversion
+ repositories, see the
<link href="http://www.apache.org/dev/version-control.html">version
control page.</link>
</p>
<p>Subversion is an open-source version control system. It has been
contributed to the Apache Software Foundation and is
- now available <link href="http://incubator.apache.org/projects/subversion.html">here</link>.
+ now available <link href="http://subversion.apache.org/">here</link>.
</p>
<p>
- The root url of the ASF Subversion repository is
+ The root url of the ASF Subversion repository is
<link href="http://svn.apache.org/repos/asf/">http://svn.apache.org/repos/asf/</link>
for non-committers and
<link href="https://svn.apache.org/repos/asf/">https://svn.apache.org/repos/asf/</link>
@@ -75,11 +84,30 @@
</section>
<section><title>Git access to POI sources </title>
<p>
- Git read-only access to POI sources is now available.
- Please see the <link href="http://git.apache.org/">Git at Apache</link> page for details.
- Git Clone URL: <link href="git://git.apache.org/poi.git">git://git.apache.org/poi.git</link>
- and Http Clone URL: <link href="http://git.apache.org/poi.git">http://git.apache.org/poi.git</link>.
+ The master source repository for Apache POI is the Subversion
+ one listed above. To support those users and developers who prefer
+ to use the Git tooling, read-only access to the POI source tree is
+ also available via Git. The Git mirrors normally track SVN to
+ within a few minutes.
</p>
+ <p>
+ The official read-only Git repository for Apache POI is available
+ from <link href="http://git.apache.org/">git.apache.org/</link>.
+ The Git Clone URL is: <link href="git://git.apache.org/poi.git">git://git.apache.org/poi.git</link>
+ and the HTTP Clone URL is: <link href="http://git.apache.org/poi.git">http://git.apache.org/poi.git</link>.
+ Please see the <link href="http://git.apache.org/">Git at
+ Apache</link> page for more details on the service.
+ </p>
+ <p>
+ In addition to the <link href="http://git.apache.org/">git.apache.org/</link>
+ repository, changes are also mirrored in near real time to GitHub.
+ The GitHub repository is available at
+ <link href="https://github.com/apache/poi">https://github.com/apache/poi</link>.
+ Please note that the GitHub repository is read-only, and all
+ contributions should continue to be sent via Bugzilla for tracking
+ (Git patches are fine, though). Please see the
+ <link href="guidelines.html">contribution guidelines</link> for more
+ information on getting involved in the project.</p>
</section>
</body>
<footer>
Modified: poi/site/src/documentation/content/xdocs/text-extraction.xml
URL:
http://svn.apache.org/viewvc/poi/site/src/documentation/content/xdocs/text-extraction.xml?rev=1486432&r1=1486431&r2=1486432&view=diff
==============================================================================
--- poi/site/src/documentation/content/xdocs/text-extraction.xml (original)
+++ poi/site/src/documentation/content/xdocs/text-extraction.xml Sun May 26 16:57:55 2013
@@ -29,14 +29,23 @@
<body>
<section><title>Overview</title>
- <p>Apache POI provides text extraction for all the supported file
- formats. In addition, it provides access to the metadata
- associated with a given file, such as title and author.</p>
- <p>In addition to providing direct text extraction classes,
- POI works closely with the
- <link href="http://incubator.apache.org/tika/">Apache Tika</link>
- text extraction library. Users may wish to simply utilise
- the functionality provided by Tika.</p>
+ <p>For a number of years now, Apache POI has provided basic
+ text extraction for all the project supported file formats. In
+ addition to the (plain) text, these extractors provide access to
+ the metadata associated with a given file, such as title and
+ author.</p>
+ <p>For more advanced text extraction needs, including Rich Text
+ extraction (such as formatting and styling), along with XML and
+ HTML output, Apache POI works closely with
+ <link href="http://tika.apache.org/">Apache Tika</link> to deliver
+ POI-powered Tika Parsers for all the project supported file formats.</p>
+ <p>If you are after turn-key text extraction, including the latest
+ support, styles etc., you are strongly advised to make use of
+ <link href="http://tika.apache.org/">Apache Tika</link>, which builds
+ on top of POI to provide Text and Metadata extraction. If you wish
+ to have something very simple and stand-alone, or you wish to make
+ heavy modifications, then the POI-provided text extractors documented
+ below might be a better fit for your needs.</p>
</section>
<section><title>Common functionality</title>
@@ -56,12 +65,16 @@
provides common methods to get at the OOXML metadata.</p>
</section>
- <section><title>Text Extractor Factory - POI 3.5 or later</title>
- <p>A new class in POI 3.5,
- <em>org.apache.poi.extractor.ExtractorFactory</em> provides a
+ <section><title>Text Extractor Factory</title>
+ <p>As part of the addition of OOXML support in Apache POI 3.5, there
+ is a common class to select the appropriate POI text extractor for
+ you. <em>org.apache.poi.extractor.ExtractorFactory</em> provides a
similar function to WorkbookFactory. You simply pass it an
- InputStream, a file, a POIFSFileSystem or a OOXML Package. It
+ InputStream, a File, a POIFSFileSystem or an OOXML Package. It
figures out the correct text extractor for you, and returns it.</p>
+ <p>For complete detection and text extractor auto-selection, users
+ are strongly encouraged to investigate
+ <link href="http://tika.apache.org/">Apache Tika</link>.</p>
</section>
<section><title>Excel</title>