Author: buildbot
Date: Mon Jan  5 20:30:08 2015
New Revision: 935171

Log:
Staging update by buildbot for pdfbox

Added:
    websites/staging/pdfbox/trunk/content/1.8/
    websites/staging/pdfbox/trunk/content/1.8/architecture.html
    websites/staging/pdfbox/trunk/content/1.8/commandline.html
    websites/staging/pdfbox/trunk/content/1.8/cookbook/
    websites/staging/pdfbox/trunk/content/1.8/cookbook/documentcreation.html
    websites/staging/pdfbox/trunk/content/1.8/cookbook/pdfacreation.html
    websites/staging/pdfbox/trunk/content/1.8/cookbook/pdfavalidation.html
    websites/staging/pdfbox/trunk/content/1.8/cookbook/textextraction.html
    websites/staging/pdfbox/trunk/content/1.8/cookbook/workingwithattachments.html
    websites/staging/pdfbox/trunk/content/1.8/cookbook/workingwithfonts.html
    websites/staging/pdfbox/trunk/content/1.8/cookbook/workingwithmetadata.html
    websites/staging/pdfbox/trunk/content/1.8/dependencies.html
    websites/staging/pdfbox/trunk/content/1.8/faq.html
Removed:
    websites/staging/pdfbox/trunk/content/architecture.html
    websites/staging/pdfbox/trunk/content/commandline/index.html
    websites/staging/pdfbox/trunk/content/cookbook/
    websites/staging/pdfbox/trunk/content/dependencies.html
    websites/staging/pdfbox/trunk/content/userguide/
Modified:
    websites/staging/pdfbox/trunk/content/   (props changed)
    websites/staging/pdfbox/trunk/content/building.html
    websites/staging/pdfbox/trunk/content/codingconventions.html
    websites/staging/pdfbox/trunk/content/css/site.css
    websites/staging/pdfbox/trunk/content/docs/1.8.2/pdfcoverage.html
    websites/staging/pdfbox/trunk/content/download.html
    websites/staging/pdfbox/trunk/content/errors/403.html
    websites/staging/pdfbox/trunk/content/errors/404.html
    websites/staging/pdfbox/trunk/content/ideas.html
    websites/staging/pdfbox/trunk/content/index.html
    websites/staging/pdfbox/trunk/content/mailinglists.html
    websites/staging/pdfbox/trunk/content/references.html
    websites/staging/pdfbox/trunk/content/sitemap.html
    websites/staging/pdfbox/trunk/content/support.html
    websites/staging/pdfbox/trunk/content/team.html

Propchange: websites/staging/pdfbox/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Mon Jan  5 20:30:08 2015
@@ -1 +1 @@
-1649484
+1649650

Added: websites/staging/pdfbox/trunk/content/1.8/architecture.html
==============================================================================
--- websites/staging/pdfbox/trunk/content/1.8/architecture.html (added)
+++ websites/staging/pdfbox/trunk/content/1.8/architecture.html Mon Jan  5 20:30:08 2015
@@ -0,0 +1,280 @@
+<!DOCTYPE html>
+<html lang="en">
+
+<!--
+     
+     Licensed to the Apache Software Foundation (ASF) under one or more
+     contributor license agreements.  See the NOTICE file distributed with
+     this work for additional information regarding copyright ownership.
+     The ASF licenses this file to You under the Apache License, Version 2.0
+     (the "License"); you may not use this file except in compliance with
+     the License.  You may obtain a copy of the License at
+     
+     http://www.apache.org/licenses/LICENSE-2.0
+     
+     Unless required by applicable law or agreed to in writing, software
+     distributed under the License is distributed on an "AS IS" BASIS,
+     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+     See the License for the specific language governing permissions and
+     limitations under the License.
+     -->
+
+<head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+
+    <title>Apache PDFBox | Architecture</title>
+
+    <link href="/bootstrap/css/bootstrap.min.css" rel="stylesheet">
+    <link href="/FontAwesome/css/font-awesome.css" rel="stylesheet">
+    <link href="/Iconic/iconic fill/iconic_fill.css" rel="stylesheet">
+    <link href="/css/pygments-github.css" rel="stylesheet">
+    
+    <link href="/css/site.css" rel="stylesheet">
+    
+    
+    
+     
+    
+    
+    <!-- Twitter Bootstrap and jQuery after this line. -->
+    <script src="//code.jquery.com/jquery-latest.js"></script>
+    <script src="/bootstrap/js/bootstrap.min.js"></script>
+</head>
+
+<body>
+    <nav class="navbar navbar-default navbar-top">
+      <div class="container">
+        <div class="navbar-header">
+          <a href="/index.html">
+            <img class="logo" src="/images/logo-head.gif">
+          </a>
+        </div>
+      </div>
+    </nav>
+    
+    <div class="container">
+        
+        <div class="row">
+            <div class="col-xs-3">
+                
+                <ul class="sidebar">
+                    <li class="sidebar-header">Apache PDFBox</li>
+                    <li><a href="/index.cgi">Overview</a></li>
+                    <li><a href="/download.cgi">Downloads</a></li>
+                    
+                    <li class="sidebar-header">Community</li>
+                    <li><a href="/support.html">Support</a></li>
+                    <li><a href="/mailinglists.html">Mailing Lists</a></li>
+                    <li><a href="/team.html">Project Team</a></li>
+                    
+                    <li class="sidebar-header">Documentation</li>
+                    <li class="sidebar-node">
+                        <a href="#">Trunk</a>
+                        <ul>
+                            <li><a href="/docs/2.0.0-SNAPSHOT/javadocs/">API 
Docs</a></li>
+                        </ul>
+                    </li>
+                    <li class="sidebar-node">
+                        <a href="#">1.8.8</a>
+                        <ul>
+                            <li><a 
href="/1.8/architecture.html">Architecture</a></li>
+                            <li><a 
href="/1.8/dependencies.html">Dependencies</a></li>
+                            <li class="dropdown">
+                                <a class="dropdown-toggle" 
data-toggle="dropdown" href="#">
+                                    Cookbook <b class="caret"></b>
+                                </a>
+                                <ul class="dropdown-menu">
+                                    <li><a 
href="/1.8/cookbook/documentcreation.html">Document Creation</a></li>
+                                    <li><a 
href="/1.8/cookbook/textextraction.html">Text Extraction</a></li>
+                                    <li><a 
href="/1.8/cookbook/pdfavalidation.html">PDF/A Validation</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithfonts.html">Working with Fonts</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithmetadata.html">Working with Metadata</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithattachments.html">Working with 
Attachments</a></li>
+                                    <li><a 
href="/1.8/cookbook/pdfacreation.html">Creating a PDF/A document</a></li>
+                                </ul>
+                            </li>
+                            <li><a href="/1.8/commandline.html">Command Line 
Tools</a></li>
+                            <li><a href="/docs/1.8.8/javadocs/">API 
Docs</a></li>
+                            <li><a href="/1.8/faq.html">FAQ</a></li>
+                        </ul>
+                    </li>
+                    
+                    <li class="sidebar-header">Development</li>
+                    <li><a href="/codingconventions.html">Coding 
Conventions</a></li>
+                    <li><a href="/building.html">Building</a></li>
+                    <li><a href="/ideas.html">Ideas</a></li>
+                    <li><a href="/references.html">References</a></li>
+
+                    <li class="sidebar-header">Apache Software Foundation</li>
+                    <li><a href="http://www.apache.org/">Apache Software Foundation</a></li>
+                    <li><a href="http://www.apache.org/foundation/thanks.html">ASF Sponsors</a></li>
+                    <li><a href="http://www.apache.org/security/">Security</a></li>
+                </ul>
+            </div>
+            <div class="col-xs-9">
+                 <h1 id="architecture">Architecture</h1>
+<p>To get the most out of PDFBox, it is necessary to understand how a PDF document
+is organized, because PDFBox was architected around the concepts laid out in the
+ISO-32000 (PDF) Specification:</p>
+<ul>
+<li><a href="http://www.iso.org/iso/catalogue_detail.htm?csnumber=51502">ISO Site</a></li>
+<li><a href="http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf">Adobe Version</a></li>
+</ul>
+<h2 id="quick-introduction-to-the-pdf-format">Quick Introduction to the PDF 
format</h2>
+<p>A PDF file is made up of a sequence of bytes. These bytes, grouped into 
tokens, 
+make up the basic objects upon which higher level objects and structures are 
built [see ISO-32000 7.3].</p>
+<p class="alert alert-info">PDFBox makes these basic objects available in the
+<code>org.apache.pdfbox.cos</code> package (the COS Model).
+</p>
+
+<p>The organization of these objects, how they are read and how they are written, is defined in the file structure of the
+PDF [see ISO-32000 7.5]. In addition, a file can be encrypted to protect the document's content [see ISO-32000 7.6].</p>
+<p class="alert alert-info">PDFBox handles the reading in the <code>org.apache.pdfbox.pdfparser</code> package.
+Writing of PDF files is handled in the <code>org.apache.pdfbox.pdfwriter</code> package.
+</p>
+
+<p>Within the file structure, basic objects are used to create a document structure, building higher level objects such
+as pages, bookmarks and annotations [see ISO-32000 7.7].</p>
+<p class="alert alert-info">PDFBox makes these higher level objects available through the
+<code>org.apache.pdfbox.pdmodel</code> package (the PD Model).
+</p>
+
+<p>In addition there is a COS representation available for the PD model if 
there is a need to 
+inspect the underlying structure or to handle special cases where the higher 
level PD model
+doesn't provide the functionality needed.</p>
+<p class="alert">It's always the COS model which is represented in the PDF 
file.</p>
+
+<h2 id="the-cos-model">The COS Model</h2>
+<p>As outlined above the basic PDF objects are represented in PDFBox in the 
org.apache.pdfbox.cos package.</p>
+<table>
+<thead>
+<tr>
+<th>PDF Type</th>
+<th>Description</th>
+<th>Example</th>
+<th>PDFBox class</th>
+<th>ISO 32000</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>Boolean</td>
+<td>Standard True/False values</td>
+<td>true</td>
+<td>org.apache.pdfbox.cos.COSBoolean</td>
+<td>7.3.2</td>
+</tr>
+<tr>
+<td>Number</td>
+<td>Integer and floating point numbers</td>
+<td>1 2.3</td>
+<td>org.apache.pdfbox.cos.COSInteger<br/>org.apache.pdfbox.cos.COSFloat</td>
+<td>7.3.3</td>
+</tr>
+<tr>
+<td>String</td>
+<td>A sequence of characters</td>
+<td>(This is a string)</td>
+<td>org.apache.pdfbox.cos.COSString</td>
+<td>7.3.4</td>
+</tr>
+<tr>
+<td>Name</td>
+<td>A predefined value in a PDF document, typically used as a key in a 
dictionary</td>
+<td>/Type</td>
+<td>org.apache.pdfbox.cos.COSName</td>
+<td>7.3.5</td>
+</tr>
+<tr>
+<td>Array</td>
+<td>Arrays are one-dimensional lists of objects accessed by a numeric index. 
Within an array each basic object is permitted as an entry.</td>
+<td>[549 3.14 false (Ralph) /SomeName]</td>
+<td>org.apache.pdfbox.cos.COSArray</td>
+<td>7.3.6</td>
+</tr>
+<tr>
+<td>Dictionary</td>
+<td>A map of name value pairs</td>
+<td>&lt;&lt;<br/>/Type /XObject<br/>/Name (Name)<br/>/Size 1<br/>&gt;&gt;</td>
+<td>org.apache.pdfbox.cos.COSDictionary</td>
+<td>7.3.7</td>
+</tr>
+<tr>
+<td>Stream</td>
+<td>A stream of data, typically compressed. This is used for page contents, 
images and embedded font streams.</td>
+<td>12 0 obj &lt;&lt; /Type /XObject &gt;&gt; stream 030004040404040404 
endstream</td>
+<td>org.apache.pdfbox.cos.COSStream</td>
+<td>7.3.8</td>
+</tr>
+<tr>
+<td>Object</td>
+<td>A wrapper around any of the other objects, which allows an object to be referenced multiple times. An object is referenced by two numbers, an object number and a generation number. The generation number is initially zero unless the object was replaced later in the stream.</td>
+<td>12 0 obj &lt;&lt; /Type /XObject &gt;&gt; endobj</td>
+<td>org.apache.pdfbox.cos.COSObject</td>
+<td></td>
+</tr>
+</tbody>
+</table>
+<p>A page in a PDF document is represented by a COSDictionary. The entries that are available for a page are listed in the PDF Reference. An example of a page looks like this:</p>
+<div class="codehilite"><pre>&lt;&lt;
+    /Type /Page
+    /MediaBox [0 0 612 915]
+    /Contents 56 0 R
+&gt;&gt;
+</pre></div>
+
+
+<p>The information within the dictionary can be accessed using the COS 
model</p>
+<div class="codehilite"><pre><span class="n">COSDictionary</span> <span 
class="n">page</span> <span class="o">=</span> <span class="o">...;</span>
+<span class="n">COSArray</span> <span class="n">mediaBox</span> <span 
class="o">=</span> <span class="o">(</span><span class="n">COSArray</span><span 
class="o">)</span><span class="n">page</span><span class="o">.</span><span 
class="na">getDictionaryObject</span><span class="o">(</span> <span 
class="s">&quot;MediaBox&quot;</span> <span class="o">);</span>
+<span class="n">System</span><span class="o">.</span><span 
class="na">out</span><span class="o">.</span><span 
class="na">println</span><span class="o">(</span> <span 
class="s">&quot;Width:&quot;</span> <span class="o">+</span> <span 
class="n">mediaBox</span><span class="o">.</span><span 
class="na">get</span><span class="o">(</span> <span class="mi">2</span> <span 
class="o">)</span> <span class="o">);</span>
+</pre></div>
+
+
+<p>As this small example shows, the COS model provides a low level API to access
+information within the PDF. Using the COS model successfully requires good knowledge of
+the PDF specification.</p>
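+<p>As a rough illustration of why this level requires knowledge of the specification: a rectangle such as MediaBox stores its corners in the order lower-left x, lower-left y, upper-right x, upper-right y [see ISO-32000 7.9.5], so a width has to be computed from two entries. The sketch below uses plain Java collections as a stand-in for a page dictionary; the class and method names are hypothetical and are not part of PDFBox.</p>

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a page dictionary; these are NOT PDFBox classes.
class MediaBoxSketch {

    // Builds a map that mirrors the /Page dictionary shown earlier.
    static Map<String, Object> examplePage() {
        Map<String, Object> page = new LinkedHashMap<String, Object>();
        page.put("Type", "Page");
        // Rectangle entries are ordered [llx lly urx ury].
        page.put("MediaBox", Arrays.asList(0.0, 0.0, 612.0, 915.0));
        return page;
    }

    // Reads one numeric entry of the MediaBox array by index.
    private static double entry(Map<String, Object> page, int index) {
        List<?> box = (List<?>) page.get("MediaBox");
        if (box == null || box.size() != 4) {
            throw new IllegalArgumentException("MediaBox must hold four numbers");
        }
        return ((Number) box.get(index)).doubleValue();
    }

    // Width is upper-right x minus lower-left x.
    static double width(Map<String, Object> page) {
        return entry(page, 2) - entry(page, 0);
    }

    // Height is upper-right y minus lower-left y.
    static double height(Map<String, Object> page) {
        return entry(page, 3) - entry(page, 1);
    }

    public static void main(String[] args) {
        Map<String, Object> page = examplePage();
        System.out.println("Width: " + width(page));
        System.out.println("Height: " + height(page));
    }
}
```

+<p>Nothing in the map itself says which index means what; that knowledge comes from the specification, which is the burden the PD Model is meant to remove.</p>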
+<h2 id="the-pd-model">The PD Model</h2>
+<p>The COS Model allows access to all aspects of a PDF document. This type of programming is
+tedious and error-prone, though, because the user must know all of the names of the
+parameters and no helper methods are available. The PD Model was created to help
+alleviate this problem. Each type of object (page, font, image) has a set of defined
+attributes that can be available in the dictionary.
+A PD Model class is available for each of these so that strongly typed methods are
+available to access the attributes.</p>
+<p>The same code from above to get the page width can be rewritten to use PD 
Model classes.</p>
+<div class="codehilite"><pre><span class="n">PDPage</span> <span 
class="n">page</span> <span class="o">=</span> <span class="o">...;</span>
+<span class="n">PDRectangle</span> <span class="n">mediaBox</span> <span 
class="o">=</span> <span class="n">page</span><span class="o">.</span><span 
class="na">getMediaBox</span><span class="o">();</span>
+<span class="n">System</span><span class="o">.</span><span 
class="na">out</span><span class="o">.</span><span 
class="na">println</span><span class="o">(</span> <span 
class="s">&quot;Width:&quot;</span> <span class="o">+</span> <span 
class="n">mediaBox</span><span class="o">.</span><span 
class="na">getWidth</span><span class="o">()</span> <span class="o">);</span>
+</pre></div>
+
+
+<p>PD Model objects sit on top of the COS model. Typically, the classes in the PD Model will only
+store a COS object, and all setter/getter methods will read or modify data that is stored in the
+COS object. For example, when you call PDPage.getLastModified() the method will do a
+lookup in the COSDictionary with the key "LastModified"; if it is found, the value is then
+converted to a java.util.Calendar. When PDPage.setLastModified( Calendar ) is called,
+the Calendar is converted to a string in the COSDictionary.</p>
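+<p>The wrapper pattern described above can be sketched with plain Java collections. This is a simplified illustration, not the PDFBox implementation: the class name is hypothetical, the dictionary is modeled as a map of strings, and the date string uses a reduced form of the PDF date syntax (UTC only, no time zone suffix).</p>

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Map;
import java.util.TimeZone;

// Hypothetical wrapper: illustrates the pattern only, not PDFBox's PDPage.
class PageWrapperSketch {
    private static final String KEY = "LastModified";

    // The wrapper holds no state of its own, only the underlying dictionary.
    private final Map<String, String> dictionary;

    PageWrapperSketch(Map<String, String> dictionary) {
        this.dictionary = dictionary;
    }

    // A fresh formatter per call, since SimpleDateFormat is not thread-safe.
    private static SimpleDateFormat format() {
        SimpleDateFormat f = new SimpleDateFormat("'D:'yyyyMMddHHmmss");
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        f.setLenient(false);
        return f;
    }

    // Typed getter: looks up the raw string and converts it; null if absent.
    Calendar getLastModified() {
        String raw = dictionary.get(KEY);
        if (raw == null) {
            return null;
        }
        try {
            Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
            cal.setTime(format().parse(raw));
            return cal;
        } catch (ParseException e) {
            throw new IllegalStateException("Malformed date: " + raw, e);
        }
    }

    // Typed setter: converts the Calendar and stores the string in the dictionary.
    void setLastModified(Calendar value) {
        dictionary.put(KEY, format().format(value.getTime()));
    }
}
```

+<p>Because every call goes straight through to the map, a change made through the wrapper is immediately visible to code that inspects the dictionary directly, and vice versa.</p>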
+            </div>
+        </div>
+    </div>
+
+    <footer class="footer">
+        <div class="container">
+            <div class="row">
+                <div class="span3">
+                    <!-- nothing in here on purpose -->
+                </div>
+                <div class="span9">
+                    <p>Copyright © 2009&ndash;2015 <a href="http://www.apache.org/">The Apache Software Foundation</a>, Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+                        <br/>Apache PDFBox, PDFBox, Apache, the Apache feather logo and the Apache PDFBox project logos are trademarks of The Apache Software Foundation.</p>
+                </div>
+            </div>
+        </div>
+    </footer>
+
+</body>
+
+</html>

Added: websites/staging/pdfbox/trunk/content/1.8/commandline.html
==============================================================================
--- websites/staging/pdfbox/trunk/content/1.8/commandline.html (added)
+++ websites/staging/pdfbox/trunk/content/1.8/commandline.html Mon Jan  5 20:30:08 2015
@@ -0,0 +1,675 @@
+<!DOCTYPE html>
+<html lang="en">
+
+<!--
+     
+     Licensed to the Apache Software Foundation (ASF) under one or more
+     contributor license agreements.  See the NOTICE file distributed with
+     this work for additional information regarding copyright ownership.
+     The ASF licenses this file to You under the Apache License, Version 2.0
+     (the "License"); you may not use this file except in compliance with
+     the License.  You may obtain a copy of the License at
+     
+     http://www.apache.org/licenses/LICENSE-2.0
+     
+     Unless required by applicable law or agreed to in writing, software
+     distributed under the License is distributed on an "AS IS" BASIS,
+     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+     See the License for the specific language governing permissions and
+     limitations under the License.
+     -->
+
+<head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+
+    <title>Apache PDFBox | Command Line Tools</title>
+
+    <link href="/bootstrap/css/bootstrap.min.css" rel="stylesheet">
+    <link href="/FontAwesome/css/font-awesome.css" rel="stylesheet">
+    <link href="/Iconic/iconic fill/iconic_fill.css" rel="stylesheet">
+    <link href="/css/pygments-github.css" rel="stylesheet">
+    
+    <link href="/css/site.css" rel="stylesheet">
+    
+    
+    
+     
+    
+    
+    <!-- Twitter Bootstrap and jQuery after this line. -->
+    <script src="//code.jquery.com/jquery-latest.js"></script>
+    <script src="/bootstrap/js/bootstrap.min.js"></script>
+</head>
+
+<body>
+    <nav class="navbar navbar-default navbar-top">
+      <div class="container">
+        <div class="navbar-header">
+          <a href="/index.html">
+            <img class="logo" src="/images/logo-head.gif">
+          </a>
+        </div>
+      </div>
+    </nav>
+    
+    <div class="container">
+        
+        <div class="row">
+            <div class="col-xs-3">
+                
+                <ul class="sidebar">
+                    <li class="sidebar-header">Apache PDFBox</li>
+                    <li><a href="/index.cgi">Overview</a></li>
+                    <li><a href="/download.cgi">Downloads</a></li>
+                    
+                    <li class="sidebar-header">Community</li>
+                    <li><a href="/support.html">Support</a></li>
+                    <li><a href="/mailinglists.html">Mailing Lists</a></li>
+                    <li><a href="/team.html">Project Team</a></li>
+                    
+                    <li class="sidebar-header">Documentation</li>
+                    <li class="sidebar-node">
+                        <a href="#">Trunk</a>
+                        <ul>
+                            <li><a href="/docs/2.0.0-SNAPSHOT/javadocs/">API 
Docs</a></li>
+                        </ul>
+                    </li>
+                    <li class="sidebar-node">
+                        <a href="#">1.8.8</a>
+                        <ul>
+                            <li><a 
href="/1.8/architecture.html">Architecture</a></li>
+                            <li><a 
href="/1.8/dependencies.html">Dependencies</a></li>
+                            <li class="dropdown">
+                                <a class="dropdown-toggle" 
data-toggle="dropdown" href="#">
+                                    Cookbook <b class="caret"></b>
+                                </a>
+                                <ul class="dropdown-menu">
+                                    <li><a 
href="/1.8/cookbook/documentcreation.html">Document Creation</a></li>
+                                    <li><a 
href="/1.8/cookbook/textextraction.html">Text Extraction</a></li>
+                                    <li><a 
href="/1.8/cookbook/pdfavalidation.html">PDF/A Validation</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithfonts.html">Working with Fonts</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithmetadata.html">Working with Metadata</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithattachments.html">Working with 
Attachments</a></li>
+                                    <li><a 
href="/1.8/cookbook/pdfacreation.html">Creating a PDF/A document</a></li>
+                                </ul>
+                            </li>
+                            <li><a href="/1.8/commandline.html">Command Line 
Tools</a></li>
+                            <li><a href="/docs/1.8.8/javadocs/">API 
Docs</a></li>
+                            <li><a href="/1.8/faq.html">FAQ</a></li>
+                        </ul>
+                    </li>
+                    
+                    <li class="sidebar-header">Development</li>
+                    <li><a href="/codingconventions.html">Coding 
Conventions</a></li>
+                    <li><a href="/building.html">Building</a></li>
+                    <li><a href="/ideas.html">Ideas</a></li>
+                    <li><a href="/references.html">References</a></li>
+
+                    <li class="sidebar-header">Apache Software Foundation</li>
+                    <li><a href="http://www.apache.org/">Apache Software Foundation</a></li>
+                    <li><a href="http://www.apache.org/foundation/thanks.html">ASF Sponsors</a></li>
+                    <li><a href="http://www.apache.org/security/">Security</a></li>
+                </ul>
+            </div>
+            <div class="col-xs-9">
+                 <h1 id="command-line-tools">Command Line Tools</h1>
+<p>PDFBox comes with a series of command line utilities. They are available as 
standard Java applications.</p>
+<p>See the Dependencies page for instructions on how to set your classpath in 
order to run 
+PDFBox tools as Java applications.</p>
+<div class="toc">
+<ul>
+<li><a href="#command-line-tools">Command Line Tools</a><ul>
+<li><a href="#decrypt">Decrypt</a></li>
+<li><a href="#encrypt">Encrypt</a></li>
+<li><a href="#extractText">ExtractText</a></li>
+<li><a href="#overlayPDF">OverlayPDF</a></li>
+<li><a href="#printPDF">PrintPDF</a></li>
+<li><a href="#pdfDebugger">PDFDebugger</a></li>
+<li><a href="#pdfReader">PDFReader</a></li>
+<li><a href="#pdfMerger">PDFMerger</a></li>
+<li><a href="#pdfSplit">PDFSplit</a></li>
+<li><a href="#pdfToImage">PDFToImage</a></li>
+<li><a href="#textToPDF">TextToPDF</a></li>
+<li><a href="#writeDecodeDoc">WriteDecodedDoc</a></li>
+</ul>
+</li>
+</ul>
+</div>
+<h2 id="decrypt">Decrypt</h2>
+<p>This application will decrypt a PDF document.</p>
+<p>NOTE: You must have the owner password to decrypt the document!</p>
+<p>usage: <code>java -jar pdfbox-app-x.y.z.jar Decrypt [OPTIONS] 
&lt;inputfile&gt; [outputfile]</code></p>
+<table>
+<thead>
+<tr>
+<th>Command Line Parameter</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>-password</td>
+<td>Password to the PDF or certificate in keystore.</td>
+</tr>
+<tr>
+<td>-keyStore</td>
+<td>Path to keystore that holds certificate to decrypt the document. This is 
only required if the document is encrypted with a certificate, otherwise only 
the password is required.</td>
+</tr>
+<tr>
+<td>-alias</td>
+<td>The alias to the certificate in the keystore.</td>
+</tr>
+<tr>
+<td>inputfile</td>
+<td>The PDF file to decrypt.</td>
+</tr>
+<tr>
+<td>outputfile</td>
+<td>The file to save the decrypted document to. If left blank then it will be 
the same as the input file.</td>
+</tr>
+</tbody>
+</table>
+<h2 id="encrypt">Encrypt</h2>
+<p>This application will encrypt a PDF document.</p>
+<p>usage: <code>java -jar pdfbox-app-x.y.z.jar Encrypt [OPTIONS] 
&lt;password&gt; &lt;inputfile&gt;</code></p>
+<table>
+<thead>
+<tr>
+<th>Command Line Parameter</th>
+<th>Default</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>-O</td>
+<td></td>
+<td>The owner password to the PDF, ignored if -certFile is specified.</td>
+</tr>
+<tr>
+<td>-U</td>
+<td></td>
+<td>The user password to the PDF, ignored if -certFile is specified.</td>
+</tr>
+<tr>
+<td>-certFile</td>
+<td></td>
+<td>Path to X.509 cert file.</td>
+</tr>
+<tr>
+<td>-canAssemble</td>
+<td>true</td>
+<td>Set the assemble permission.</td>
+</tr>
+<tr>
+<td>-canExtractContent</td>
+<td>true</td>
+<td>Set the extraction permission.</td>
+</tr>
+<tr>
+<td>-canExtractForAccessibility</td>
+<td>true</td>
+<td>Set the extraction for accessibility permission.</td>
+</tr>
+<tr>
+<td>-canFillInForm</td>
+<td>true</td>
+<td>Set the fill in form permission.</td>
+</tr>
+<tr>
+<td>-canModify</td>
+<td>true</td>
+<td>Set the modify permission.</td>
+</tr>
+<tr>
+<td>-canModifyAnnotations</td>
+<td>true</td>
+<td>Set the modify annots permission.</td>
+</tr>
+<tr>
+<td>-canPrint</td>
+<td>true</td>
+<td>Set the print permission.</td>
+</tr>
+<tr>
+<td>-canPrintDegraded</td>
+<td>true</td>
+<td>Set the print degraded permission.</td>
+</tr>
+<tr>
+<td>-keyLength</td>
+<td>40</td>
+<td>The number of bits for the encryption key.</td>
+</tr>
+<tr>
+<td>inputfile</td>
+<td></td>
+<td>The PDF file to encrypt.</td>
+</tr>
+<tr>
+<td>outputfile</td>
+<td></td>
+<td>The file to save the encrypted document to. If left blank then it will be the same as the input file.</td>
+</tr>
+</tbody>
+</table>
+<h2 id="extractText">ExtractText</h2>
+<p>This application will extract all text from the given PDF document.</p>
+<p>usage: <code>java -jar pdfbox-app-x.y.z.jar ExtractText [OPTIONS] 
&lt;inputfile&gt; [Text file]</code></p>
+<table>
+<thead>
+<tr>
+<th>Command Line Parameter</th>
+<th>Default</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>-password</td>
+<td></td>
+<td>The password to the PDF document.</td>
+</tr>
+<tr>
+<td>-encoding</td>
+<td>default encoding</td>
+<td>The encoding type of the text file, e.g. ISO-8859-1, UTF-8, UTF-16BE.</td>
+</tr>
+<tr>
+<td>-console</td>
+<td>false</td>
+<td>Send text to console instead of file.</td>
+</tr>
+<tr>
+<td>-html</td>
+<td>false</td>
+<td>Output in HTML format instead of raw text.</td>
+</tr>
+<tr>
+<td>-sort</td>
+<td>false</td>
+<td>Sort the text before writing.</td>
+</tr>
+<tr>
+<td>-ignoreBeads</td>
+<td>false</td>
+<td>Disables the separation by beads.</td>
+</tr>
+<tr>
+<td>-force</td>
+<td>false</td>
+<td>Enables pdfbox to ignore corrupt objects.</td>
+</tr>
+<tr>
+<td>-debug</td>
+<td>false</td>
+<td>Enables debug output about the time consumption of every stage.</td>
+</tr>
+<tr>
+<td>-startPage</td>
+<td>1</td>
+<td>The first page to extract, one based.</td>
+</tr>
+<tr>
+<td>-endPage</td>
+<td>Integer.MAX_VALUE</td>
+<td>The last page to extract, one based.</td>
+</tr>
+<tr>
+<td>-nonSeq</td>
+<td>false</td>
+<td>Use the new non sequential parser.</td>
+</tr>
+</tbody>
+</table>
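+<p>When the tool is driven from another Java program, the argument list can be assembled in code. The following is a minimal sketch, assuming each option value is passed as its own argument; the class name, method name and file names are placeholders, and the jar name keeps the x.y.z placeholder from the usage line.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that only assembles the command line; it does not run it.
class ExtractTextCommandSketch {

    // Builds the argument list for extracting a one-based page range to a text file.
    static List<String> build(String jarPath, String inputPdf, String outputTxt,
                              int startPage, int endPage, String encoding) {
        if (startPage < 1 || endPage < startPage) {
            throw new IllegalArgumentException("Invalid page range");
        }
        List<String> cmd = new ArrayList<String>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(jarPath);
        cmd.add("ExtractText");
        // Options come first, each followed by its value.
        cmd.add("-encoding");
        cmd.add(encoding);
        cmd.add("-startPage");
        cmd.add(String.valueOf(startPage));
        cmd.add("-endPage");
        cmd.add(String.valueOf(endPage));
        // Positional arguments: input file, then the optional text file.
        cmd.add(inputPdf);
        cmd.add(outputTxt);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("pdfbox-app-x.y.z.jar", "input.pdf", "output.txt", 1, 3, "UTF-8");
        System.out.println(String.join(" ", cmd));
        // To launch it: new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

+<p>Passing the list to ProcessBuilder instead of a single string avoids quoting problems with file names that contain spaces.</p>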
+<h2 id="overlayPDF">OverlayPDF</h2>
+<p>This application will overlay one document with the content of another document.</p>
+<p>usage: <code>java -jar pdfbox-app-x.y.z.jar OverlayPDF &lt;input.pdf&gt; 
[OPTIONS] &lt;output.pdf&gt;</code></p>
+<table>
+<thead>
+<tr>
+<th>Command Line Parameter</th>
+<th>Default</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>inputfile</td>
+<td></td>
+<td>The PDF file to be overlaid.</td>
+</tr>
+<tr>
+<td>defaultOverlay.pdf</td>
+<td></td>
+<td>Default overlay file.</td>
+</tr>
+<tr>
+<td>-odd oddPageOverlay.pdf</td>
+<td></td>
+<td>Overlay file used for odd pages.</td>
+</tr>
+<tr>
+<td>-even evenPageOverlay.pdf</td>
+<td></td>
+<td>Overlay file used for even pages.</td>
+</tr>
+<tr>
+<td>-first firstPageOverlay.pdf</td>
+<td></td>
+<td>Overlay file used for the first page.</td>
+</tr>
+<tr>
+<td>-last lastPageOverlay.pdf</td>
+<td></td>
+<td>Overlay file used for the last page.</td>
+</tr>
+<tr>
+<td>-page pageNumber specificPageOverlay.pdf</td>
+<td></td>
+<td>Overlay file used for the given page number, may occur more than once.</td>
+</tr>
+<tr>
+<td>-position</td>
+<td>background</td>
+<td>Where to put the overlay, foreground or background.</td>
+</tr>
+<tr>
+<td>-nonSeq</td>
+<td>false</td>
+<td>Use the new non sequential parser.</td>
+</tr>
+<tr>
+<td>outputfile</td>
+<td></td>
+<td>The resulting pdf file.</td>
+</tr>
+</tbody>
+</table>
+<p>Examples:</p>
+<ul>
+<li>OverlayPDF input.pdf overlay.pdf -nonSeq output.pdf</li>
+<li>OverlayPDF input.pdf defaultOverlay.pdf -page 10 overlayForPage10.pdf 
-position foreground -nonSeq output.pdf</li>
+<li>OverlayPDF input.pdf -odd oddOverlay.pdf -even evenOverlay.pdf -nonSeq 
output.pdf</li>
+</ul>
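+<p>Several of these options can apply to the same page, for example page 1 is both the first page and an odd page. The sketch below shows one plausible way to resolve that. The precedence used here (specific page, then first, last, odd/even, default) is an assumption made for illustration and is not stated on this page; the class and field names are hypothetical.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the overlay options; not the PDFBox implementation.
class OverlayChoiceSketch {
    // Overlays given with -page, keyed by one-based page number.
    final Map<Integer, String> specific = new HashMap<Integer, String>();
    String first;          // -first
    String last;           // -last
    String odd;            // -odd
    String even;           // -even
    String defaultOverlay; // defaultOverlay.pdf

    // Returns the overlay file for a one-based page, or null if none applies.
    // ASSUMED precedence: specific page > first > last > odd/even > default.
    String choose(int page, int pageCount) {
        if (specific.containsKey(page)) {
            return specific.get(page);
        }
        if (page == 1 && first != null) {
            return first;
        }
        if (page == pageCount && last != null) {
            return last;
        }
        if (page % 2 == 1 && odd != null) {
            return odd;
        }
        if (page % 2 == 0 && even != null) {
            return even;
        }
        return defaultOverlay;
    }
}
```

+<p>With a default overlay and a -page 10 overlay configured, this model returns the page 10 file for page 10 and the default file for every other page.</p>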
+<h2 id="printPDF">PrintPDF</h2>
+<p>This application will send a pdf document to the printer.</p>
+<p class="alert alert-info">You must have the correct permissions to print the 
document!</p>
+
+<p>usage: <code>java -jar pdfbox-app-x.y.z.jar PrintPDF [OPTIONS] 
&lt;inputfile&gt;</code></p>
+<table>
+<thead>
+<tr>
+<th>Command Line Parameter</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>-password</td>
+<td>The password to decrypt the PDF.</td>
+</tr>
+<tr>
+<td>-silentPrint</td>
+<td>Print the PDF without prompting for a printer.</td>
+</tr>
+<tr>
+<td>inputfile</td>
+<td>The PDF file to print.</td>
+</tr>
+</tbody>
+</table>
+<h2 id="pdfDebugger">PDFDebugger</h2>
+<p>This application will take an existing PDF document and lets you analyze and inspect its internal structure.</p>
+<p>usage: <code>java -jar pdfbox-app-x.y.z.jar PDFDebugger 
[inputfile]</code></p>
+<table>
+<thead>
+<tr>
+<th>Command Line Parameter</th>
+<th>Default</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>-password</td>
+<td></td>
+<td>The password to the PDF document.</td>
+</tr>
+<tr>
+<td>-nonSeq</td>
+<td>false</td>
+<td>Use the new non sequential parser.</td>
+</tr>
+<tr>
+<td>inputfile</td>
+<td></td>
+<td>The name of an optional PDF file to open.</td>
+</tr>
+</tbody>
+</table>
+<h2 id="pdfReader">PDFReader</h2>
+<p>An application to read PDF documents. This will provide Acrobat Reader like 
functionality.</p>
+<p>usage: <code>java -jar pdfbox-app-x.y.z.jar PDFReader [PDF file]</code></p>
+<table>
+<thead>
+<tr>
+<th>Command Line Parameter</th>
+<th>Default</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>-password</td>
+<td></td>
+<td>The password to the PDF document.</td>
+</tr>
+<tr>
+<td>-nonSeq</td>
+<td>false</td>
+<td>Use the new non sequential parser.</td>
+</tr>
+<tr>
+<td>PDF file</td>
+<td></td>
+<td>The name of an optional PDF file to open.</td>
+</tr>
+</tbody>
+</table>
+<h2 id="pdfMerger">PDFMerger</h2>
+<p>This application will take a list of pdf documents and merge them, saving 
the result in a new document.</p>
+<p>usage: <code>java -jar pdfbox-app-x.y.z.jar PDFMerger &lt;Source PDF files 
(2 ..n)&gt; &lt;Target PDF file&gt;</code></p>
+<h2 id="pdfSplit">PDFSplit</h2>
+<p>This application takes an existing PDF document and splits it into a 
number of other documents.</p>
+<p>usage: <code>java -jar pdfbox-app-x.y.z.jar PDFSplit [OPTIONS] &lt;PDF 
file&gt;</code></p>
+<table>
+<thead>
+<tr>
+<th>Command Line Parameter</th>
+<th>Default</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>-password</td>
+<td></td>
+<td>The password to the PDF document.</td>
+</tr>
+<tr>
+<td>-split</td>
+<td></td>
+<td>Number of pages in each part the PDF is split into.</td>
+</tr>
+<tr>
+<td>-startPage</td>
+<td></td>
+<td>The page to start at.</td>
+</tr>
+<tr>
+<td>-endPage</td>
+<td></td>
+<td>The page to stop at.</td>
+</tr>
+<tr>
+<td>-nonSeq</td>
+<td>false</td>
+<td>Use the new non sequential parser.</td>
+</tr>
+</tbody>
+</table>
+<p>Examples:</p>
+<ul>
+<li><code>PDFSplit -split 2 sample_with_13_pages.pdf</code> splits the PDF into 
pieces of 2 pages each, except the last, which contains only 1 page.</li>
+<li><code>PDFSplit -startPage 5 sample_with_13_pages.pdf</code> produces a PDF 
containing all pages of the source PDF from page 5 onward.</li>
+<li><code>PDFSplit -startPage 5 -endPage 10 sample_with_13_pages.pdf</code> 
produces a PDF containing pages 5 to 10 of the source PDF.</li>
+<li><code>PDFSplit -split 2 -startPage 5 -endPage 10 
sample_with_13_pages.pdf</code> produces 3 PDFs of 2 pages each, together 
containing pages 5 to 10 of the source PDF.</li>
+</ul>
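+<p>The page arithmetic behind these examples can be sketched in plain Java. 
This is only an illustration under stated assumptions, not the PDFSplit 
implementation: the class <code>SplitRanges</code> and its method 
<code>ranges</code> are hypothetical names, and the sketch assumes one-based, 
inclusive page numbers.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox: computes which pages end up in
// which part when pages startPage..endPage are cut into fixed-size pieces.
public class SplitRanges {

    /**
     * Returns the inclusive, one-based {first, last} page ranges obtained by
     * cutting pages startPage..endPage into pieces of at most pagesPerPart pages.
     */
    public static List<int[]> ranges(int startPage, int endPage, int pagesPerPart) {
        if (startPage < 1 || endPage < startPage || pagesPerPart < 1) {
            throw new IllegalArgumentException("invalid page range or part size");
        }
        List<int[]> parts = new ArrayList<int[]>();
        for (int first = startPage; first <= endPage; first += pagesPerPart) {
            // The last part may be shorter than pagesPerPart.
            int last = Math.min(first + pagesPerPart - 1, endPage);
            parts.add(new int[] { first, last });
        }
        return parts;
    }

    public static void main(String[] args) {
        // Mirrors "-split 2" on a 13-page file: the final part holds page 13 only.
        for (int[] part : ranges(1, 13, 2)) {
            System.out.println(part[0] + "-" + part[1]);
        }
    }
}
```

+<p>With these assumptions, <code>ranges(5, 10, 2)</code> corresponds to the 
last example in the list: three parts covering pages 5 to 10.</p>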
+<h2 id="pdfToImage">PDFToImage</h2>
+<p>This application will create an image for every page in the PDF 
document.</p>
+<p>usage: <code>java -jar pdfbox-app-x.y.z.jar PDFToImage [OPTIONS] &lt;PDF 
file&gt;</code></p>
+<table>
+<thead>
+<tr>
+<th>Command Line Parameter</th>
+<th>Default</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>-password</td>
+<td></td>
+<td>The password to the PDF document.</td>
+</tr>
+<tr>
+<td>-imageType</td>
+<td>jpg</td>
+<td>The image type to write to. Currently only jpg or png.</td>
+</tr>
+<tr>
+<td>-outputPrefix</td>
+<td>Name of PDF document</td>
+<td>The prefix to the image file.</td>
+</tr>
+<tr>
+<td>-startPage</td>
+<td>1</td>
+<td>The first page to convert, one based.</td>
+</tr>
+<tr>
+<td>-endPage</td>
+<td>Integer.MAX_VALUE</td>
+<td>The last page to convert, one based.</td>
+</tr>
+<tr>
+<td>-nonSeq</td>
+<td>false</td>
+<td>Use the new non sequential parser.</td>
+</tr>
+</tbody>
+</table>
+<h2 id="textToPDF">TextToPDF</h2>
+<p>This application will create a PDF document from a text file.</p>
+<p>usage: <code>java -jar pdfbox-app-x.y.z.jar TextToPDF [OPTIONS] 
&lt;outputfile&gt; &lt;textfile&gt;</code></p>
+<table>
+<thead>
+<tr>
+<th>Command Line Parameter</th>
+<th>Default</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>-standardFont</td>
+<td>Helvetica</td>
+<td>The font to use for the text. Either this or -ttf should be specified but 
not both.</td>
+</tr>
+<tr>
+<td>-ttf</td>
+<td></td>
+<td>The TTF font to use for the text. Either this or -standardFont should be 
specified but not both.</td>
+</tr>
+<tr>
+<td>-fontSize</td>
+<td>10</td>
+<td>The size of the font to use.</td>
+</tr>
+</tbody>
+</table>
+<p>The following font names can be used for the parameter 
<code>standardFont</code>:</p>
+<ul>
+<li>Courier</li>
+<li>Courier-Bold</li>
+<li>Courier-Oblique</li>
+<li>Courier-BoldOblique</li>
+<li>Helvetica</li>
+<li>Helvetica-Bold</li>
+<li>Helvetica-Oblique</li>
+<li>Helvetica-BoldOblique</li>
+<li>Symbol</li>
+<li>Times-Bold</li>
+<li>Times-Roman</li>
+<li>Times-Italic</li>
+<li>Times-BoldItalic</li>
+<li>ZapfDingbats</li>
+</ul>
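+<p>The rule that <code>-standardFont</code> and <code>-ttf</code> exclude 
each other, together with the list of accepted names, can be sketched as a small 
check in plain Java. This is a hedged illustration only: the class 
<code>FontOptionCheck</code> and its method <code>validate</code> are 
hypothetical, the exact-match comparison is an assumption, and TextToPDF itself 
may handle these options differently.</p>

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of PDFBox: checks a -standardFont / -ttf
// combination against the rules described in the table and list above.
public class FontOptionCheck {

    /** The font names listed above for the standardFont parameter. */
    public static final Set<String> STANDARD_FONTS = Collections.unmodifiableSet(
        new HashSet<String>(Arrays.asList(
            "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
            "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
            "Symbol",
            "Times-Bold", "Times-Roman", "Times-Italic", "Times-BoldItalic",
            "ZapfDingbats")));

    /**
     * Returns null when the combination is acceptable, otherwise a message
     * describing the problem. A null argument means the option was not given.
     */
    public static String validate(String standardFont, String ttfPath) {
        if (standardFont != null && ttfPath != null) {
            return "Specify either -standardFont or -ttf, not both";
        }
        // Assumption: names are matched exactly as written in the list.
        if (standardFont != null && !STANDARD_FONTS.contains(standardFont)) {
            return "Unknown standard font: " + standardFont;
        }
        // Neither option given is fine: the documented default is Helvetica.
        return null;
    }
}
```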
+<h2 id="writeDecodeDoc">WriteDecodedDoc</h2>
+<p>An application to decompress PDF documents.</p>
+<p>usage: <code>java -jar pdfbox-app-x.y.z.jar WriteDecodedDoc 
&lt;input-file&gt; &lt;output-file&gt;</code></p>
+<table>
+<thead>
+<tr>
+<th>Command Line Parameter</th>
+<th>Default</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>-password</td>
+<td></td>
+<td>The password to the PDF document.</td>
+</tr>
+<tr>
+<td>-nonSeq</td>
+<td>false</td>
+<td>Use the new non sequential parser.</td>
+</tr>
+<tr>
+<td>&lt;input-file&gt;</td>
+<td></td>
+<td>The PDF file to decompress</td>
+</tr>
+<tr>
+<td>&lt;output-file&gt;</td>
+<td></td>
+<td>The destination PDF file</td>
+</tr>
+</tbody>
+</table> 
+            </div>
+        </div>
+    </div>
+
+    <footer class="footer">
+        <div class="container">
+            <div class="row">
+                <div class="span3">
+                    <!-- nothing in here on purpose -->
+                </div>
+                <div class="span9">
+                    <p>Copyright © 2009&ndash;2015 <a 
href="http://www.apache.org/">The Apache Software Foundation</a>, Licensed 
under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, 
Version 2.0</a>.
+                        <br/>Apache PDFBox, PDFBox, Apache, the Apache feather 
logo and the Apache PDFBox project logos are trademarks of The Apache Software 
Foundation.</p>
+                </div>
+            </div>
+        </div>
+    </footer>
+
+</body>
+
+</html>

Added: websites/staging/pdfbox/trunk/content/1.8/cookbook/documentcreation.html
==============================================================================
--- websites/staging/pdfbox/trunk/content/1.8/cookbook/documentcreation.html 
(added)
+++ websites/staging/pdfbox/trunk/content/1.8/cookbook/documentcreation.html 
Mon Jan  5 20:30:08 2015
@@ -0,0 +1,183 @@
+<!DOCTYPE html>
+<html lang="en">
+
+<!--
+     
+     Licensed to the Apache Software Foundation (ASF) under one or more
+     contributor license agreements.  See the NOTICE file distributed with
+     this work for additional information regarding copyright ownership.
+     The ASF licenses this file to You under the Apache License, Version 2.0
+     (the "License"); you may not use this file except in compliance with
+     the License.  You may obtain a copy of the License at
+     
+     http://www.apache.org/licenses/LICENSE-2.0
+     
+     Unless required by applicable law or agreed to in writing, software
+     distributed under the License is distributed on an "AS IS" BASIS,
+     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+     See the License for the specific language governing permissions and
+     limitations under the License.
+     -->
+
+<head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+
+    <title>Apache PDFBox | Cookbook - Document Creation</title>
+
+    <link href="/bootstrap/css/bootstrap.min.css" rel="stylesheet">
+    <link href="/FontAwesome/css/font-awesome.css" rel="stylesheet">
+    <link href="/Iconic/iconic fill/iconic_fill.css" rel="stylesheet">
+    <link href="/css/pygments-github.css" rel="stylesheet">
+    
+    <link href="/css/site.css" rel="stylesheet">
+    
+    
+    
+     
+    
+    
+    <!-- Twitter Bootstrap and jQuery after this line. -->
+    <script src="//code.jquery.com/jquery-latest.js"></script>
+    <script src="/bootstrap/js/bootstrap.min.js"></script>
+</head>
+
+<body>
+    <nav class="navbar navbar-default navbar-top">
+      <div class="container">
+        <div class="navbar-header">
+          <a href="/index.html">
+            <img class="logo" src="/images/logo-head.gif">
+          </a>
+        </div>
+      </div>
+    </nav>
+    
+    <div class="container">
+        
+        <div class="row">
+            <div class="col-xs-3">
+                
+                <ul class="sidebar">
+                    <li class="sidebar-header">Apache PDFBox</li>
+                    <li><a href="/index.cgi">Overview</a></li>
+                    <li><a href="/download.cgi">Downloads</a></li>
+                    
+                    <li class="sidebar-header">Community</li>
+                    <li><a href="/support.html">Support</a></li>
+                    <li><a href="/mailinglists.html">Mailing Lists</a></li>
+                    <li><a href="/team.html">Project Team</a></li>
+                    
+                    <li class="sidebar-header">Documentation</li>
+                    <li class="sidebar-node">
+                        <a href="#">Trunk</a>
+                        <ul>
+                            <li><a href="/docs/2.0.0-SNAPSHOT/javadocs/">API 
Docs</a></li>
+                        </ul>
+                    </li>
+                    <li class="sidebar-node">
+                        <a href="#">1.8.8</a>
+                        <ul>
+                            <li><a 
href="/1.8/architecture.html">Architecture</a></li>
+                            <li><a 
href="/1.8/dependencies.html">Dependencies</a></li>
+                            <li class="dropdown">
+                                <a class="dropdown-toggle" 
data-toggle="dropdown" href="#">
+                                    Cookbook <b class="caret"></b>
+                                </a>
+                                <ul class="dropdown-menu">
+                                    <li><a 
href="/1.8/cookbook/documentcreation.html">Document Creation</a></li>
+                                    <li><a 
href="/1.8/cookbook/textextraction.html">Text Extraction</a></li>
+                                    <li><a 
href="/1.8/cookbook/pdfavalidation.html">PDF/A Validation</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithfonts.html">Working with Fonts</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithmetadata.html">Working with Metadata</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithattachments.html">Working with 
Attachments</a></li>
+                                    <li><a 
href="/1.8/cookbook/pdfacreation.html">Creating a PDF/A document</a></li>
+                                </ul>
+                            </li>
+                            <li><a href="/1.8/commandline.html">Command Line 
Tools</a></li>
+                            <li><a href="/docs/1.8.8/javadocs/">API 
Docs</a></li>
+                            <li><a href="/1.8/userguide/faq.html">FAQ</a></li>
+                        </ul>
+                    </li>
+                    
+                    <li class="sidebar-header">Development</li>
+                    <li><a href="/codingconventions.html">Coding 
Conventions</a></li>
+                    <li><a href="/building.html">Building</a></li>
+                    <li><a href="/ideas.html">Ideas</a></li>
+                    <li><a href="/references.html">References</a></li>
+
+                    <li class="sidebar-header">Apache Software Foundation</li>
+                    <li><a href="http://www.apache.org/">Apache Software 
Foundation</a></li>
+                    <li><a 
href="http://www.apache.org/foundation/thanks.html">ASF Sponsors</a></li>
+                    <li><a 
href="http://www.apache.org/security/">Security</a></li>
+                </ul>
+            </div>
+            <div class="col-xs-9">
+                 <h1 id="document-creation">Document Creation</h1>
+<h2 id="create-a-blank-pdf">Create a blank PDF</h2>
+<p>This small sample shows how to create a new PDF document using PDFBox.</p>
+<div class="codehilite"><pre><span class="c1">// Create a new empty 
document</span>
+<span class="n">PDDocument</span> <span class="n">document</span> <span 
class="o">=</span> <span class="k">new</span> <span 
class="n">PDDocument</span><span class="o">();</span>
+
+<span class="c1">// Create a new blank page and add it to the document</span>
+<span class="n">PDPage</span> <span class="n">blankPage</span> <span 
class="o">=</span> <span class="k">new</span> <span 
class="n">PDPage</span><span class="o">();</span>
+<span class="n">document</span><span class="o">.</span><span 
class="na">addPage</span><span class="o">(</span> <span 
class="n">blankPage</span> <span class="o">);</span>
+
+<span class="c1">// Save the newly created document</span>
+<span class="n">document</span><span class="o">.</span><span 
class="na">save</span><span class="o">(</span><span 
class="s">&quot;BlankPage.pdf&quot;</span><span class="o">);</span>
+
+<span class="c1">// finally make sure that the document is properly</span>
+<span class="c1">// closed.</span>
+<span class="n">document</span><span class="o">.</span><span 
class="na">close</span><span class="o">();</span>
+</pre></div>
+
+
+<h2 id="hello-world-using-a-pdf-base-font">Hello World using a PDF base 
font</h2>
+<p>This small sample shows how to create a new document and print the text 
"Hello World" using one of the PDF base fonts.</p>
+<div class="codehilite"><pre><span class="c1">// Create a document and add a 
page to it</span>
+<span class="n">PDDocument</span> <span class="n">document</span> <span 
class="o">=</span> <span class="k">new</span> <span 
class="n">PDDocument</span><span class="o">();</span>
+<span class="n">PDPage</span> <span class="n">page</span> <span 
class="o">=</span> <span class="k">new</span> <span 
class="n">PDPage</span><span class="o">();</span>
+<span class="n">document</span><span class="o">.</span><span 
class="na">addPage</span><span class="o">(</span> <span class="n">page</span> 
<span class="o">);</span>
+
+<span class="c1">// Create a new font object selecting one of the PDF base 
fonts</span>
+<span class="n">PDFont</span> <span class="n">font</span> <span 
class="o">=</span> <span class="n">PDType1Font</span><span 
class="o">.</span><span class="na">HELVETICA_BOLD</span><span class="o">;</span>
+
+<span class="c1">// Start a new content stream that will hold the content to 
be created</span>
+<span class="n">PDPageContentStream</span> <span 
class="n">contentStream</span> <span class="o">=</span> <span 
class="k">new</span> <span class="n">PDPageContentStream</span><span 
class="o">(</span><span class="n">document</span><span class="o">,</span> <span 
class="n">page</span><span class="o">);</span>
+
+<span class="c1">// Define a text content stream using the selected font, 
moving the cursor and drawing the text &quot;Hello World&quot;</span>
+<span class="n">contentStream</span><span class="o">.</span><span 
class="na">beginText</span><span class="o">();</span>
+<span class="n">contentStream</span><span class="o">.</span><span 
class="na">setFont</span><span class="o">(</span> <span 
class="n">font</span><span class="o">,</span> <span class="mi">12</span> <span 
class="o">);</span>
+<span class="n">contentStream</span><span class="o">.</span><span 
class="na">moveTextPositionByAmount</span><span class="o">(</span> <span 
class="mi">100</span><span class="o">,</span> <span class="mi">700</span> <span 
class="o">);</span>
+<span class="n">contentStream</span><span class="o">.</span><span 
class="na">drawString</span><span class="o">(</span> <span 
class="s">&quot;Hello World&quot;</span> <span class="o">);</span>
+<span class="n">contentStream</span><span class="o">.</span><span 
class="na">endText</span><span class="o">();</span>
+
+<span class="c1">// Make sure that the content stream is closed:</span>
+<span class="n">contentStream</span><span class="o">.</span><span 
class="na">close</span><span class="o">();</span>
+
+<span class="c1">// Save the results and ensure that the document is properly 
closed:</span>
+<span class="n">document</span><span class="o">.</span><span 
class="na">save</span><span class="o">(</span> <span class="s">&quot;Hello 
World.pdf&quot;</span><span class="o">);</span>
+<span class="n">document</span><span class="o">.</span><span 
class="na">close</span><span class="o">();</span>
+</pre></div> 
+            </div>
+        </div>
+    </div>
+
+    <footer class="footer">
+        <div class="container">
+            <div class="row">
+                <div class="span3">
+                    <!-- nothing in here on purpose -->
+                </div>
+                <div class="span9">
+                    <p>Copyright © 2009&ndash;2015 <a 
href="http://www.apache.org/">The Apache Software Foundation</a>, Licensed 
under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, 
Version 2.0</a>.
+                        <br/>Apache PDFBox, PDFBox, Apache, the Apache feather 
logo and the Apache PDFBox project logos are trademarks of The Apache Software 
Foundation.</p>
+                </div>
+            </div>
+        </div>
+    </footer>
+
+</body>
+
+</html>

Added: websites/staging/pdfbox/trunk/content/1.8/cookbook/pdfacreation.html
==============================================================================
--- websites/staging/pdfbox/trunk/content/1.8/cookbook/pdfacreation.html (added)
+++ websites/staging/pdfbox/trunk/content/1.8/cookbook/pdfacreation.html Mon 
Jan  5 20:30:08 2015
@@ -0,0 +1,184 @@
+<!DOCTYPE html>
+<html lang="en">
+
+<!--
+     
+     Licensed to the Apache Software Foundation (ASF) under one or more
+     contributor license agreements.  See the NOTICE file distributed with
+     this work for additional information regarding copyright ownership.
+     The ASF licenses this file to You under the Apache License, Version 2.0
+     (the "License"); you may not use this file except in compliance with
+     the License.  You may obtain a copy of the License at
+     
+     http://www.apache.org/licenses/LICENSE-2.0
+     
+     Unless required by applicable law or agreed to in writing, software
+     distributed under the License is distributed on an "AS IS" BASIS,
+     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+     See the License for the specific language governing permissions and
+     limitations under the License.
+     -->
+
+<head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+
+    <title>Apache PDFBox | Create a valid PDF/A document</title>
+
+    <link href="/bootstrap/css/bootstrap.min.css" rel="stylesheet">
+    <link href="/FontAwesome/css/font-awesome.css" rel="stylesheet">
+    <link href="/Iconic/iconic fill/iconic_fill.css" rel="stylesheet">
+    <link href="/css/pygments-github.css" rel="stylesheet">
+    
+    <link href="/css/site.css" rel="stylesheet">
+    
+    
+    
+     
+    
+    
+    
+    <!-- Twitter Bootstrap and jQuery after this line. -->
+    <script src="//code.jquery.com/jquery-latest.js"></script>
+    <script src="/bootstrap/js/bootstrap.min.js"></script>
+</head>
+
+<body>
+    <nav class="navbar navbar-default navbar-top">
+      <div class="container">
+        <div class="navbar-header">
+          <a href="/index.html">
+            <img class="logo" src="/images/logo-head.gif">
+          </a>
+        </div>
+      </div>
+    </nav>
+    
+    <div class="container">
+        
+        <div class="row">
+            <div class="col-xs-3">
+                
+                <ul class="sidebar">
+                    <li class="sidebar-header">Apache PDFBox</li>
+                    <li><a href="/index.cgi">Overview</a></li>
+                    <li><a href="/download.cgi">Downloads</a></li>
+                    
+                    <li class="sidebar-header">Community</li>
+                    <li><a href="/support.html">Support</a></li>
+                    <li><a href="/mailinglists.html">Mailing Lists</a></li>
+                    <li><a href="/team.html">Project Team</a></li>
+                    
+                    <li class="sidebar-header">Documentation</li>
+                    <li class="sidebar-node">
+                        <a href="#">Trunk</a>
+                        <ul>
+                            <li><a href="/docs/2.0.0-SNAPSHOT/javadocs/">API 
Docs</a></li>
+                        </ul>
+                    </li>
+                    <li class="sidebar-node">
+                        <a href="#">1.8.8</a>
+                        <ul>
+                            <li><a 
href="/1.8/architecture.html">Architecture</a></li>
+                            <li><a 
href="/1.8/dependencies.html">Dependencies</a></li>
+                            <li class="dropdown">
+                                <a class="dropdown-toggle" 
data-toggle="dropdown" href="#">
+                                    Cookbook <b class="caret"></b>
+                                </a>
+                                <ul class="dropdown-menu">
+                                    <li><a 
href="/1.8/cookbook/documentcreation.html">Document Creation</a></li>
+                                    <li><a 
href="/1.8/cookbook/textextraction.html">Text Extraction</a></li>
+                                    <li><a 
href="/1.8/cookbook/pdfavalidation.html">PDF/A Validation</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithfonts.html">Working with Fonts</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithmetadata.html">Working with Metadata</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithattachments.html">Working with 
Attachments</a></li>
+                                    <li><a 
href="/1.8/cookbook/pdfacreation.html">Creating a PDF/A document</a></li>
+                                </ul>
+                            </li>
+                            <li><a href="/1.8/commandline.html">Command Line 
Tools</a></li>
+                            <li><a href="/docs/1.8.8/javadocs/">API 
Docs</a></li>
+                            <li><a href="/1.8/userguide/faq.html">FAQ</a></li>
+                        </ul>
+                    </li>
+                    
+                    <li class="sidebar-header">Development</li>
+                    <li><a href="/codingconventions.html">Coding 
Conventions</a></li>
+                    <li><a href="/building.html">Building</a></li>
+                    <li><a href="/ideas.html">Ideas</a></li>
+                    <li><a href="/references.html">References</a></li>
+
+                    <li class="sidebar-header">Apache Software Foundation</li>
+                    <li><a href="http://www.apache.org/">Apache Software 
Foundation</a></li>
+                    <li><a 
href="http://www.apache.org/foundation/thanks.html">ASF Sponsors</a></li>
+                    <li><a 
href="http://www.apache.org/security/">Security</a></li>
+                </ul>
+            </div>
+            <div class="col-xs-9">
+                 <h1 id="pdfa-creation">PDF/A Creation</h1>
+<p>The Apache PDFBox API can be used to create a PDF/A file. PDF/A is a PDF 
file subject to additional constraints that ensure its 
+long-term preservation. These constraints are described in ISO 19005.</p>
+<p>This small sample shows what must be added while creating a PDF file 
to make it a valid PDF/A 
+document. The example creates a valid PDF/A-1b document.</p>
+<h2 id="load-all-the-fonts-used-in-document">Load all the fonts used in 
the document</h2>
+<p>The PDF/A specification requires that all fonts used in the document be 
embedded in the PDF file, so you
+have to load them. For example:</p>
+<div class="codehilite"><pre><span class="n">InputStream</span> <span 
class="n">fontStream</span> <span class="o">=</span> <span 
class="n">CreatePDFA</span><span class="o">.</span><span 
class="na">class</span><span class="o">.</span><span 
class="na">getResourceAsStream</span><span class="o">(</span><span 
class="s">&quot;/org/apache/pdfbox/resources/ttf/ArialMT.ttf&quot;</span><span 
class="o">);</span>
+<span class="n">PDFont</span> <span class="n">font</span> <span 
class="o">=</span> <span class="n">PDTrueTypeFont</span><span 
class="o">.</span><span class="na">loadTTF</span><span class="o">(</span><span 
class="n">doc</span><span class="o">,</span> <span 
class="n">fontStream</span><span class="o">);</span>
+</pre></div>
+
+
+<h2 id="including-xmp-metadata-block">Including XMP metadata block</h2>
+<p>XMP metadata must be defined in the PDF. At a minimum, the PDF/A 
identification schema (which states the version
+of the PDF/A specification that the document conforms to) must be present. These lines 
create the XMP metadata for a
+PDF/A-1b document:</p>
+<div class="codehilite"><pre><span class="n">XMPMetadata</span> <span 
class="n">xmp</span> <span class="o">=</span> <span class="k">new</span> <span 
class="n">XMPMetadata</span><span class="o">();</span>
+<span class="n">XMPSchemaPDFAId</span> <span class="n">pdfaid</span> <span 
class="o">=</span> <span class="k">new</span> <span 
class="n">XMPSchemaPDFAId</span><span class="o">(</span><span 
class="n">xmp</span><span class="o">);</span>
+<span class="n">xmp</span><span class="o">.</span><span 
class="na">addSchema</span><span class="o">(</span><span 
class="n">pdfaid</span><span class="o">);</span>
+<span class="n">pdfaid</span><span class="o">.</span><span 
class="na">setConformance</span><span class="o">(</span><span 
class="s">&quot;B&quot;</span><span class="o">);</span>
+<span class="n">pdfaid</span><span class="o">.</span><span 
class="na">setPart</span><span class="o">(</span><span class="mi">1</span><span 
class="o">);</span>
+<span class="n">pdfaid</span><span class="o">.</span><span 
class="na">setAbout</span><span class="o">(</span><span 
class="s">&quot;&quot;</span><span class="o">);</span>
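+<span class="c1">// metadata is the PDMetadata object set on the document 
catalog (not shown)</span>
+<span class="n">metadata</span><span class="o">.</span><span 
class="na">importXMPMetadata</span><span class="o">(</span><span 
class="n">xmp</span><span class="o">);</span>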
+</pre></div>
+
+
+<h2 id="including-color-profile">Including color profile</h2>
+<p>The color profile used by the document must be included. 
Different profiles can be used. This 
+example uses one that ships with PDFBox:</p>
+<div class="codehilite"><pre><span class="c1">// create output intent</span>
+<span class="n">InputStream</span> <span class="n">colorProfile</span> <span 
class="o">=</span> <span class="n">CreatePDFA</span><span 
class="o">.</span><span class="na">class</span><span class="o">.</span><span 
class="na">getResourceAsStream</span><span class="o">(</span><span 
class="s">&quot;/org/apache/pdfbox/resources/pdfa/sRGB Color Space 
Profile.icm&quot;</span><span class="o">);</span>
+<span class="n">PDOutputIntent</span> <span class="n">oi</span> <span 
class="o">=</span> <span class="k">new</span> <span 
class="n">PDOutputIntent</span><span class="o">(</span><span 
class="n">doc</span><span class="o">,</span> <span 
class="n">colorProfile</span><span class="o">);</span> 
+<span class="n">oi</span><span class="o">.</span><span 
class="na">setInfo</span><span class="o">(</span><span class="s">&quot;sRGB 
IEC61966-2.1&quot;</span><span class="o">);</span> 
+<span class="n">oi</span><span class="o">.</span><span 
class="na">setOutputCondition</span><span class="o">(</span><span 
class="s">&quot;sRGB IEC61966-2.1&quot;</span><span class="o">);</span> 
+<span class="n">oi</span><span class="o">.</span><span 
class="na">setOutputConditionIdentifier</span><span class="o">(</span><span 
class="s">&quot;sRGB IEC61966-2.1&quot;</span><span class="o">);</span> 
+<span class="n">oi</span><span class="o">.</span><span 
class="na">setRegistryName</span><span class="o">(</span><span 
class="s">&quot;http://www.color.org&quot;</span><span class="o">);</span> 
+<span class="c1">// cat is the document catalog of doc (not shown)</span>
+<span class="n">cat</span><span class="o">.</span><span 
class="na">addOutputIntent</span><span class="o">(</span><span 
class="n">oi</span><span class="o">);</span>
+</pre></div>
+
+
+<h2 id="complete-example">Complete example</h2>
+<p>The complete example can be found in pdfbox-examples. The source file is:</p>
+<div class="codehilite"><pre><span class="n">src</span><span 
class="o">/</span><span class="n">main</span><span class="o">/</span><span 
class="n">java</span><span class="o">/</span><span class="n">org</span><span 
class="o">/</span><span class="n">apache</span><span class="o">/</span><span 
class="n">pdfbox</span><span class="o">/</span><span 
class="n">examples</span><span class="o">/</span><span 
class="n">pdfa</span><span class="o">/</span><span 
class="n">CreatePDFA</span><span class="p">.</span><span class="n">java</span>
+</pre></div> 
+            </div>
+        </div>
+    </div>
+
+    <footer class="footer">
+        <div class="container">
+            <div class="row">
+                <div class="span3">
+                    <!-- nothing in here on purpose -->
+                </div>
+                <div class="span9">
+                    <p>Copyright © 2009&ndash;2015 <a 
href="http://www.apache.org/">The Apache Software Foundation</a>, Licensed 
under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, 
Version 2.0</a>.
+                        <br/>Apache PDFBox, PDFBox, Apache, the Apache feather 
logo and the Apache PDFBox project logos are trademarks of The Apache Software 
Foundation.</p>
+                </div>
+            </div>
+        </div>
+    </footer>
+
+</body>
+
+</html>

Added: websites/staging/pdfbox/trunk/content/1.8/cookbook/pdfavalidation.html
==============================================================================
--- websites/staging/pdfbox/trunk/content/1.8/cookbook/pdfavalidation.html 
(added)
+++ websites/staging/pdfbox/trunk/content/1.8/cookbook/pdfavalidation.html Mon 
Jan  5 20:30:08 2015
@@ -0,0 +1,240 @@
+<!DOCTYPE html>
+<html lang="en">
+
+<!--
+     
+     Licensed to the Apache Software Foundation (ASF) under one or more
+     contributor license agreements.  See the NOTICE file distributed with
+     this work for additional information regarding copyright ownership.
+     The ASF licenses this file to You under the Apache License, Version 2.0
+     (the "License"); you may not use this file except in compliance with
+     the License.  You may obtain a copy of the License at
+     
+     http://www.apache.org/licenses/LICENSE-2.0
+     
+     Unless required by applicable law or agreed to in writing, software
+     distributed under the License is distributed on an "AS IS" BASIS,
+     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+     See the License for the specific language governing permissions and
+     limitations under the License.
+     -->
+
+<head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+
+    <title>Apache PDFBox | Cookbook - PDF/A Validation</title>
+
+    <link href="/bootstrap/css/bootstrap.min.css" rel="stylesheet">
+    <link href="/FontAwesome/css/font-awesome.css" rel="stylesheet">
+    <link href="/Iconic/iconic fill/iconic_fill.css" rel="stylesheet">
+    <link href="/css/pygments-github.css" rel="stylesheet">
+    
+    <link href="/css/site.css" rel="stylesheet">
+    
+    
+    
+     
+    
+    
+    <!-- Twitter Bootstrap and jQuery after this line. -->
+    <script src="//code.jquery.com/jquery-latest.js"></script>
+    <script src="/bootstrap/js/bootstrap.min.js"></script>
+</head>
+
+<body>
+    <nav class="navbar navbar-default navbar-top">
+      <div class="container">
+        <div class="navbar-header">
+          <a href="/index.html">
+            <img class="logo" src="/images/logo-head.gif">
+          </a>
+        </div>
+      </div>
+    </nav>
+    
+    <div class="container">
+        
+        <div class="row">
+            <div class="col-xs-3">
+                
+                <ul class="sidebar">
+                    <li class="sidebar-header">Apache PDFBox</li>
+                    <li><a href="/index.cgi">Overview</a></li>
+                    <li><a href="/download.cgi">Downloads</a></li>
+                    
+                    <li class="sidebar-header">Community</li>
+                    <li><a href="/support.html">Support</a></li>
+                    <li><a href="/mailinglists.html">Mailing Lists</a></li>
+                    <li><a href="/team.html">Project Team</a></li>
+                    
+                    <li class="sidebar-header">Documentation</li>
+                    <li class="sidebar-node">
+                        <a href="#">Trunk</a>
+                        <ul>
+                            <li><a href="/docs/2.0.0-SNAPSHOT/javadocs/">API 
Docs</a></li>
+                        </ul>
+                    </li>
+                    <li class="sidebar-node">
+                        <a href="#">1.8.8</a>
+                        <ul>
+                            <li><a 
href="/1.8/architecture.html">Architecture</a></li>
+                            <li><a 
href="/1.8/dependencies.html">Dependencies</a></li>
+                            <li class="dropdown">
+                                <a class="dropdown-toggle" 
data-toggle="dropdown" href="#">
+                                    Cookbook <b class="caret"></b>
+                                </a>
+                                <ul class="dropdown-menu">
+                                    <li><a 
href="/1.8/cookbook/documentcreation.html">Document Creation</a></li>
+                                    <li><a 
href="/1.8/cookbook/textextraction.html">Text Extraction</a></li>
+                                    <li><a 
href="/1.8/cookbook/pdfavalidation.html">PDF/A Validation</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithfonts.html">Working with Fonts</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithmetadata.html">Working with Metadata</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithattachments.html">Working with 
Attachments</a></li>
+                                    <li><a 
href="/1.8/cookbook/pdfacreation.html">Creating a PDF/A document</a></li>
+                                </ul>
+                            </li>
+                            <li><a href="/1.8/commandline.html">Command Line 
Tools</a></li>
+                            <li><a href="/docs/1.8.8/javadocs/">API 
Docs</a></li>
+                            <li><a href="/1.8/userguide/faq.html">FAQ</a></li>
+                        </ul>
+                    </li>
+                    
+                    <li class="sidebar-header">Development</li>
+                    <li><a href="/codingconventions.html">Coding 
Conventions</a></li>
+                    <li><a href="/building.html">Building</a></li>
+                    <li><a href="/ideas.html">Ideas</a></li>
+                    <li><a href="/references.html">References</a></li>
+
+                    <li class="sidebar-header">Apache Software Foundation</li>
+                    <li><a href="http://www.apache.org/">Apache Software Foundation</a></li>
+                    <li><a href="http://www.apache.org/foundation/thanks.html">ASF Sponsors</a></li>
+                    <li><a href="http://www.apache.org/security/">Security</a></li>
+                </ul>
+            </div>
+            <div class="col-xs-9">
+                 <h1 id="pdfa-validation">PDF/A Validation</h1>
+<p>Apache Preflight is a Java library that implements a parser compliant with the ISO 19005-1 specification (PDF/A-1).</p>
+<h2 id="check-compliance-with-pdfa-1b">Check Compliance with PDF/A-1b</h2>
+<p>This small sample shows how to check whether a file complies with the PDF/A-1b specification.</p>
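+<div class="codehilite"><pre><span class="kn">import</span> <span class="nn">javax.activation.FileDataSource</span><span class="o">;</span>
+<span class="kn">import</span> <span class="nn">org.apache.pdfbox.preflight.PreflightDocument</span><span class="o">;</span>
+<span class="kn">import</span> <span class="nn">org.apache.pdfbox.preflight.ValidationResult</span><span class="o">;</span>
+<span class="kn">import</span> <span class="nn">org.apache.pdfbox.preflight.ValidationResult.ValidationError</span><span class="o">;</span>
+<span class="kn">import</span> <span class="nn">org.apache.pdfbox.preflight.exception.SyntaxValidationException</span><span class="o">;</span>
+<span class="kn">import</span> <span class="nn">org.apache.pdfbox.preflight.parser.PreflightParser</span><span class="o">;</span>
+
+<span class="n">ValidationResult</span> <span class="n">result</span> <span class="o">=</span> <span class="kc">null</span><span class="o">;</span>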
+
+<span class="n">FileDataSource</span> <span class="n">fd</span> <span 
class="o">=</span> <span class="k">new</span> <span 
class="n">FileDataSource</span><span class="o">(</span><span 
class="n">args</span><span class="o">[</span><span class="mi">0</span><span 
class="o">]);</span>
+<span class="n">PreflightParser</span> <span class="n">parser</span> <span 
class="o">=</span> <span class="k">new</span> <span 
class="n">PreflightParser</span><span class="o">(</span><span 
class="n">fd</span><span class="o">);</span>
+<span class="k">try</span>
+<span class="o">{</span>
+
+    <span class="cm">/* Parse the PDF file with PreflightParser, which inherits from NonSequentialPDFParser.</span>
+<span class="cm">     * It performs some additional checks required by PDF/A</span>
+<span class="cm">     * (stream length consistency, EOL after certain keywords, ...).</span>
+<span class="cm">     */</span>
+    <span class="n">parser</span><span class="o">.</span><span 
class="na">parse</span><span class="o">();</span>
+
+    <span class="cm">/* Once the syntax validation is done,</span>
+<span class="cm">     * the parser can provide a PreflightDocument</span>
+<span class="cm">     * (which inherits from PDDocument).</span>
+<span class="cm">     * This document performs the rest of the PDF/A validation.</span>
+<span class="cm">     */</span>
+    <span class="n">PreflightDocument</span> <span class="n">document</span> 
<span class="o">=</span> <span class="n">parser</span><span 
class="o">.</span><span class="na">getPreflightDocument</span><span 
class="o">();</span>
+    <span class="n">document</span><span class="o">.</span><span 
class="na">validate</span><span class="o">();</span>
+
+    <span class="c1">// Get validation result</span>
+    <span class="n">result</span> <span class="o">=</span> <span 
class="n">document</span><span class="o">.</span><span 
class="na">getResult</span><span class="o">();</span>
+    <span class="n">document</span><span class="o">.</span><span 
class="na">close</span><span class="o">();</span>
+
+<span class="o">}</span>
+<span class="k">catch</span> <span class="o">(</span><span 
class="n">SyntaxValidationException</span> <span class="n">e</span><span 
class="o">)</span>
+<span class="o">{</span>
+    <span class="cm">/* the parse method can throw a SyntaxValidationException 
</span>
+<span class="cm">     * if the PDF file can&#39;t be parsed.</span>
+<span class="cm">     * In this case, the exception contains an instance of 
ValidationResult  </span>
+<span class="cm">     */</span>
+    <span class="n">result</span> <span class="o">=</span> <span 
class="n">e</span><span class="o">.</span><span 
class="na">getResult</span><span class="o">();</span>
+<span class="o">}</span>
+
+<span class="c1">// display validation result</span>
+<span class="k">if</span> <span class="o">(</span><span 
class="n">result</span><span class="o">.</span><span 
class="na">isValid</span><span class="o">())</span>
+<span class="o">{</span>
+    <span class="n">System</span><span class="o">.</span><span 
class="na">out</span><span class="o">.</span><span 
class="na">println</span><span class="o">(</span><span class="s">&quot;The file 
&quot;</span> <span class="o">+</span> <span class="n">args</span><span 
class="o">[</span><span class="mi">0</span><span class="o">]</span> <span 
class="o">+</span> <span class="s">&quot; is a valid PDF/A-1b 
file&quot;</span><span class="o">);</span>
+<span class="o">}</span>
+<span class="k">else</span>
+<span class="o">{</span>
+    <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">&quot;The file &quot;</span> <span class="o">+</span> <span class="n">args</span><span class="o">[</span><span class="mi">0</span><span class="o">]</span> <span class="o">+</span> <span class="s">&quot; is not valid, error(s):&quot;</span><span class="o">);</span>
+    <span class="k">for</span> <span class="o">(</span><span 
class="n">ValidationError</span> <span class="n">error</span> <span 
class="o">:</span> <span class="n">result</span><span class="o">.</span><span 
class="na">getErrorsList</span><span class="o">())</span>
+    <span class="o">{</span>
+        <span class="n">System</span><span class="o">.</span><span 
class="na">out</span><span class="o">.</span><span 
class="na">println</span><span class="o">(</span><span 
class="n">error</span><span class="o">.</span><span 
class="na">getErrorCode</span><span class="o">()</span> <span 
class="o">+</span> <span class="s">&quot; : &quot;</span> <span 
class="o">+</span> <span class="n">error</span><span class="o">.</span><span 
class="na">getDetails</span><span class="o">());</span>
+    <span class="o">}</span>
+<span class="o">}</span>
+</pre></div>
+
+
+<h2 id="categories-of-validation-error">Categories of Validation Error</h2>
+<p>If validation fails, the ValidationResult object contains all causes of the failure.
+To make failures easier to understand, all error codes have the form X[.Y[.Z]], where:</p>
+<ul>
+<li>'X' is the category (e.g. font validation error)</li>
+<li>'Y' represents a subsection of the category (e.g. "Font with Glyph error")</li>
+<li>'Z' represents the cause of the error (e.g. "Font with a missing Glyph")</li>
+</ul>
+<p>The subsection ('Y') and the cause ('Z') may be missing, depending on how precisely the error could be identified.</p>
+<p>The categories are listed below (for detailed causes, see the constants in the PreflightConstants interface):</p>
+<table>
+<thead>
+<tr>
+<th>Category</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>1[.y[.z]]</td>
+<td>Syntax Error</td>
+</tr>
+<tr>
+<td>2[.y[.z]]</td>
+<td>Graphic Error</td>
+</tr>
+<tr>
+<td>3[.y[.z]]</td>
+<td>Font Error</td>
+</tr>
+<tr>
+<td>4[.y[.z]]</td>
+<td>Transparency Error</td>
+</tr>
+<tr>
+<td>5[.y[.z]]</td>
+<td>Annotation Error</td>
+</tr>
+<tr>
+<td>6[.y[.z]]</td>
+<td>Action Error</td>
+</tr>
+<tr>
+<td>7[.y[.z]]</td>
+<td>Metadata Error</td>
+</tr>
+</tbody>
+</table> 
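+<p>As a rough illustration of the X[.Y[.Z]] form described above, the sketch below splits an error code string into its parts and looks up the category description from the table. The class <code>PreflightErrorCode</code> and its methods are hypothetical helpers, not part of Preflight, and the sketch assumes every part of a code is a decimal integer.</p>

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Preflight): splits a code such as "3.1.2" into its parts. */
final class PreflightErrorCode {
    // Category descriptions copied from the table above, indexed by category number.
    private static final String[] CATEGORIES = {
        null, "Syntax Error", "Graphic Error", "Font Error", "Transparency Error",
        "Annotation Error", "Action Error", "Metadata Error"
    };

    final int category;        // 'X', always present
    final Integer subcategory; // 'Y', null when absent
    final Integer cause;       // 'Z', null when absent

    private PreflightErrorCode(int category, Integer subcategory, Integer cause) {
        this.category = category;
        this.subcategory = subcategory;
        this.cause = cause;
    }

    /** Parses "X", "X.Y" or "X.Y.Z"; assumes each part is a decimal integer. */
    static PreflightErrorCode parse(String code) {
        // The -1 limit keeps trailing empty parts, so "3." is rejected rather than read as "3".
        String[] parts = code.trim().split("\\.", -1);
        if (parts.length > 3) {
            throw new IllegalArgumentException("Expected X[.Y[.Z]] but got: " + code);
        }
        int x = Integer.parseInt(parts[0]);
        Integer y = parts.length > 1 ? Integer.valueOf(parts[1]) : null;
        Integer z = parts.length > 2 ? Integer.valueOf(parts[2]) : null;
        return new PreflightErrorCode(x, y, z);
    }

    /** Returns the description of the category, or a fallback for numbers outside the table. */
    String categoryDescription() {
        boolean known = category >= 1 && category < CATEGORIES.length;
        return known ? CATEGORIES[category] : "Unknown category";
    }

    public static void main(String[] args) {
        // The codes below only illustrate the three possible shapes of a code.
        for (String code : Arrays.asList("1", "3.1", "7.1.1")) {
            System.out.println(code + " : " + parse(code).categoryDescription());
        }
    }
}
```

+<p>In the validation sample, such a helper could be applied to the value returned by <code>error.getErrorCode()</code>, for example to group the reported errors by category.</p>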
+            </div>
+        </div>
+    </div>
+
+    <footer class="footer">
+        <div class="container">
+            <div class="row">
+                <div class="span3">
+                    <!-- nothing in here on purpose -->
+                </div>
+                <div class="span9">
+                    <p>Copyright © 2009&ndash;2015 <a href="http://www.apache.org/">The Apache Software Foundation</a>, Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+                        <br/>Apache PDFBox, PDFBox, Apache, the Apache feather logo and the Apache PDFBox project logos are trademarks of The Apache Software Foundation.</p>
+                </div>
+            </div>
+        </div>
+    </footer>
+
+</body>
+
+</html>

Added: websites/staging/pdfbox/trunk/content/1.8/cookbook/textextraction.html
==============================================================================
--- websites/staging/pdfbox/trunk/content/1.8/cookbook/textextraction.html 
(added)
+++ websites/staging/pdfbox/trunk/content/1.8/cookbook/textextraction.html Mon 
Jan  5 20:30:08 2015
@@ -0,0 +1,250 @@
+<!DOCTYPE html>
+<html lang="en">
+
+<!--
+     
+     Licensed to the Apache Software Foundation (ASF) under one or more
+     contributor license agreements.  See the NOTICE file distributed with
+     this work for additional information regarding copyright ownership.
+     The ASF licenses this file to You under the Apache License, Version 2.0
+     (the "License"); you may not use this file except in compliance with
+     the License.  You may obtain a copy of the License at
+     
+     http://www.apache.org/licenses/LICENSE-2.0
+     
+     Unless required by applicable law or agreed to in writing, software
+     distributed under the License is distributed on an "AS IS" BASIS,
+     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+     See the License for the specific language governing permissions and
+     limitations under the License.
+     -->
+
+<head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+
+    <title>Apache PDFBox | Cookbook - Text Extraction</title>
+
+    <link href="/bootstrap/css/bootstrap.min.css" rel="stylesheet">
+    <link href="/FontAwesome/css/font-awesome.css" rel="stylesheet">
+    <link href="/Iconic/iconic fill/iconic_fill.css" rel="stylesheet">
+    <link href="/css/pygments-github.css" rel="stylesheet">
+    
+    <link href="/css/site.css" rel="stylesheet">
+    
+    
+    
+     
+    
+    
+    <!-- Twitter Bootstrap and jQuery after this line. -->
+    <script src="//code.jquery.com/jquery-latest.js"></script>
+    <script src="/bootstrap/js/bootstrap.min.js"></script>
+</head>
+
+<body>
+    <nav class="navbar navbar-default navbar-top">
+      <div class="container">
+        <div class="navbar-header">
+          <a href="/index.html">
+            <img class="logo" src="/images/logo-head.gif">
+          </a>
+        </div>
+      </div>
+    </nav>
+    
+    <div class="container">
+        
+        <div class="row">
+            <div class="col-xs-3">
+                
+                <ul class="sidebar">
+                    <li class="sidebar-header">Apache PDFBox</li>
+                    <li><a href="/index.cgi">Overview</a></li>
+                    <li><a href="/download.cgi">Downloads</a></li>
+                    
+                    <li class="sidebar-header">Community</li>
+                    <li><a href="/support.html">Support</a></li>
+                    <li><a href="/mailinglists.html">Mailing Lists</a></li>
+                    <li><a href="/team.html">Project Team</a></li>
+                    
+                    <li class="sidebar-header">Documentation</li>
+                    <li class="sidebar-node">
+                        <a href="#">Trunk</a>
+                        <ul>
+                            <li><a href="/docs/2.0.0-SNAPSHOT/javadocs/">API 
Docs</a></li>
+                        </ul>
+                    </li>
+                    <li class="sidebar-node">
+                        <a href="#">1.8.8</a>
+                        <ul>
+                            <li><a 
href="/1.8/architecture.html">Architecture</a></li>
+                            <li><a 
href="/1.8/dependencies.html">Dependencies</a></li>
+                            <li class="dropdown">
+                                <a class="dropdown-toggle" 
data-toggle="dropdown" href="#">
+                                    Cookbook <b class="caret"></b>
+                                </a>
+                                <ul class="dropdown-menu">
+                                    <li><a 
href="/1.8/cookbook/documentcreation.html">Document Creation</a></li>
+                                    <li><a 
href="/1.8/cookbook/textextraction.html">Text Extraction</a></li>
+                                    <li><a 
href="/1.8/cookbook/pdfavalidation.html">PDF/A Validation</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithfonts.html">Working with Fonts</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithmetadata.html">Working with Metadata</a></li>
+                                    <li><a 
href="/1.8/cookbook/workingwithattachments.html">Working with 
Attachments</a></li>
+                                    <li><a 
href="/1.8/cookbook/pdfacreation.html">Creating a PDF/A document</a></li>
+                                </ul>
+                            </li>
+                            <li><a href="/1.8/commandline.html">Command Line 
Tools</a></li>
+                            <li><a href="/docs/1.8.8/javadocs/">API 
Docs</a></li>
+                            <li><a href="/1.8/userguide/faq.html">FAQ</a></li>
+                        </ul>
+                    </li>
+                    
+                    <li class="sidebar-header">Development</li>
+                    <li><a href="/codingconventions.html">Coding 
Conventions</a></li>
+                    <li><a href="/building.html">Building</a></li>
+                    <li><a href="/ideas.html">Ideas</a></li>
+                    <li><a href="/references.html">References</a></li>
+
+                    <li class="sidebar-header">Apache Software Foundation</li>
+                    <li><a href="http://www.apache.org/">Apache Software Foundation</a></li>
+                    <li><a href="http://www.apache.org/foundation/thanks.html">ASF Sponsors</a></li>
+                    <li><a href="http://www.apache.org/security/">Security</a></li>
+                </ul>
+            </div>
+            <div class="col-xs-9">
+                 <h1 id="textextraction">Text Extraction</h1>
+<h2 id="extracting-text">Extracting Text</h2>
+<p>See class:org.apache.pdfbox.util.PDFTextStripper<br />
+See class:org.apache.pdfbox.searchengine.lucene.LucenePDFDocument<br />
+See command line app:ExtractText  </p>
+<p>One of the main features of PDFBox is its ability to quickly and accurately extract text
+from a variety of PDF documents. This functionality is encapsulated in the
+org.apache.pdfbox.util.PDFTextStripper class and can easily be run from the command line with
+org.apache.pdfbox.ExtractText.</p>
+<h2 id="lucene-integration">Lucene Integration</h2>
+<p>Lucene is an open source text search library from the Apache Software Foundation. Before
+Lucene can index a PDF document, the document must first be converted to text. PDFBox provides
+a simple approach for adding PDF documents to a Lucene index.</p>
+<div class="codehilite"><pre><span class="n">Document</span> <span 
class="n">luceneDocument</span> <span class="o">=</span> <span 
class="n">LucenePDFDocument</span><span class="o">.</span><span 
class="na">getDocument</span><span class="o">(</span> <span 
class="o">...</span> <span class="o">);</span>
+</pre></div>
+
+
+<p>Now that you have a Lucene Document object, you can add it to the Lucene index just as
+you would if it had been created from a text or HTML file. The LucenePDFDocument automatically
+extracts a variety of metadata fields from the PDF to be added to the index; the javadoc
+shows details on those fields. This approach is very simple and should be sufficient for
+most users. If it is not, you can use some of the advanced text extraction techniques
+described in the next section.</p>
+<h2 id="advanced-text-extraction">Advanced Text Extraction</h2>
+<p>Some applications have complex text extraction requirements that neither the command
+line application nor the LucenePDFDocument can fulfill.
+Users can use or extend the PDFTextStripper class to meet some of
+these requirements.</p>
+<h3 id="limiting-the-extracted-text">Limiting The Extracted Text</h3>
+<p>There are several ways that we can limit the text that is extracted during 
the extraction 
+process. The simplest is to specify the range of pages that you want to be 
extracted. 
+For example, to only extract text from the second and third pages of the PDF 
document 
+you could do this:</p>
+<div class="codehilite"><pre><span class="n">PDFTextStripper</span> <span 
class="n">stripper</span> <span class="o">=</span> <span class="k">new</span> 
<span class="n">PDFTextStripper</span><span class="o">();</span>
+<span class="n">stripper</span><span class="o">.</span><span 
class="na">setStartPage</span><span class="o">(</span> <span 
class="mi">2</span> <span class="o">);</span>
+<span class="n">stripper</span><span class="o">.</span><span 
class="na">setEndPage</span><span class="o">(</span> <span class="mi">3</span> 
<span class="o">);</span>
+<span class="n">stripper</span><span class="o">.</span><span 
class="na">writeText</span><span class="o">(</span> <span class="o">...</span> 
<span class="o">);</span>
+</pre></div>
+
+
+<p>NOTE: The startPage and endPage properties of PDFTextStripper are 1-based and inclusive.</p>
+<p>If you wanted to start on page 2 and extract to the end of the document then you would just
+set the startPage property. By default all pages in the PDF document are extracted.</p>
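+<p>The 1-based, inclusive page range can be pictured with the small sketch below. It does not call PDFBox; <code>PageRange</code> is a hypothetical class that only models the rule, and the default values (start page 1, end page <code>Integer.MAX_VALUE</code>) are an assumption chosen to match "all pages are extracted by default".</p>

```java
/** Hypothetical model of the startPage/endPage rule; not PDFBox code. */
final class PageRange {
    // Assumed defaults: together they select every page of any document.
    static final int DEFAULT_START_PAGE = 1;
    static final int DEFAULT_END_PAGE = Integer.MAX_VALUE;

    /** Returns true if the 1-based page number lies inside the inclusive range. */
    static boolean isExtracted(int pageNumber, int startPage, int endPage) {
        return pageNumber >= startPage && pageNumber <= endPage;
    }

    public static void main(String[] args) {
        // With startPage = 2 and endPage = 3, only pages 2 and 3 of a 5-page document qualify.
        for (int page = 1; page <= 5; page++) {
            System.out.println("page " + page + " extracted: " + isExtracted(page, 2, 3));
        }
    }
}
```

+<p>Setting only the start page to 2 corresponds to <code>isExtracted(page, 2, DEFAULT_END_PAGE)</code>, which holds for page 2 and every page after it.</p>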
+<p>It is also possible to limit the extracted text to the part of the document between two bookmarks.
+If you are not familiar with how to use bookmarks in PDFBox then you should review the
+Bookmarks page. Similar to the startPage/endPage properties, PDFTextStripper also has
+startBookmark/endBookmark properties. There are some caveats to be aware of when using this
+feature of the PDFTextStripper: not all bookmarks point to a page in the current PDF document.</p>
+<p>The possible states of a bookmark are:</p>
+<ul>
+<li>null - The property was not set, this is the default.</li>
+<li>Points to page in the PDF - The property was set and points to a valid 
page in the PDF</li>
+<li>Bookmark does not point to anything - The property was set but the 
bookmark does not point to any page</li>
+<li>Bookmark points to external action - The property was set, but it points 
to a page in a different PDF or performs an action when activated</li>
+</ul>
+<p>The table below describes how PDFBox behaves in the various scenarios:</p>
+<table>
+<thead>
+<tr>
+<th>Start Bookmark</th>
+<th>End Bookmark</th>
+<th>Result</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>null</td>
+<td>null</td>
+<td>This is the default, the properties have no effect on the text 
extraction.</td>
+</tr>
+<tr>
+<td>Points to a page in the PDF</td>
+<td>null</td>
+<td>Text extraction will begin on the page that this bookmark points to and go 
until the end of the document.</td>
+</tr>
+<tr>
+<td>null</td>
+<td>Points to a page in the PDF</td>
+<td>Text extraction will begin on the first page and stop at the end of the 
page that this bookmark points to.</td>
+</tr>
+<tr>
+<td>Bookmark does not point to anything</td>
+<td>null</td>
+<td>Because the PDFTextStripper cannot determine a start page based on the 
bookmark, it will start on the first page and go until the end of the 
document.</td>
+</tr>
+<tr>
+<td>null</td>
+<td>Bookmark does not point to anything</td>
+<td>Because the PDFTextStripper cannot determine an end page based on the bookmark, it will start on the first page and go until the end of the document.</td>
+</tr>
+<tr>
+<td>Bookmark does not point to anything</td>
+<td>Bookmark does not point to anything</td>
+<td>This is a special case! If the startBookmark and endBookmark are exactly the same then no text will be extracted. If they are different then it is not possible for the PDFTextStripper to determine the pages, so it will include the entire document.</td>
+</tr>
+<tr>
+<td>Bookmark points to external action</td>
+<td>Bookmark points to external action</td>
+<td>If either the startBookmark or the endBookmark refers to an external page or executes an action then an OutlineNotLocalException will be thrown to indicate to the user that the bookmark is not valid.</td>
+</tr>
+</tbody>
+</table>
+<p>NOTE: PDFTextStripper will check both the startPage/endPage and the 
startBookmark/endBookmark to determine if text should be extracted from the 
current page.</p>
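+<p>The bookmark table above can be read as a small decision procedure. The sketch below restates it in plain Java without calling PDFBox; <code>BookmarkRange</code>, its <code>State</code> enum and the returned descriptions are hypothetical names used only for illustration. The combinations the table does not list (for example both bookmarks pointing to pages) are filled in by assumption.</p>

```java
/** Hypothetical restatement of the bookmark table; not PDFBox code. */
final class BookmarkRange {
    /** The four bookmark states listed in the text. */
    enum State { NOT_SET, POINTS_TO_PAGE, POINTS_TO_NOTHING, EXTERNAL }

    /**
     * Describes the extraction range for a start/end bookmark pair.
     * sameBookmark matters only when both bookmarks point to nothing.
     */
    static String describe(State start, State end, boolean sameBookmark) {
        // An external page or action on either side is reported as an error.
        if (start == State.EXTERNAL || end == State.EXTERNAL) {
            return "OutlineNotLocalException";
        }
        // Special case from the table: two unresolvable bookmarks.
        if (start == State.POINTS_TO_NOTHING && end == State.POINTS_TO_NOTHING) {
            return sameBookmark ? "no text" : "first page to end of document";
        }
        // Otherwise each side falls back to the document boundary unless it resolves to a page.
        String from = (start == State.POINTS_TO_PAGE) ? "start bookmark page" : "first page";
        String to = (end == State.POINTS_TO_PAGE) ? "end of end bookmark page" : "end of document";
        return from + " to " + to;
    }

    public static void main(String[] args) {
        // Print the outcome for every combination of states (different bookmarks assumed).
        for (State start : State.values()) {
            for (State end : State.values()) {
                System.out.println(start + " / " + end + " : " + describe(start, end, false));
            }
        }
    }
}
```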
+<h3 id="external-glyph-list">External Glyph List</h3>
+<p>Some PDF files need to map between glyph names and Unicode values during 
text extraction. 
+PDFBox comes with an Adobe Glyph List, but you may encounter files with glyph 
names that 
+are not in that map. To use your own glyphlist file, supply the file name to 
the <code>glyphlist_ext</code> JVM property.</p>
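+<p>A JVM property such as <code>glyphlist_ext</code> can be given on the command line with <code>-Dglyphlist_ext=...</code> or set from code. The sketch below shows the second option using only the standard library. <code>GlyphListConfig</code> and the file name <code>my-glyphlist.txt</code> are placeholders, and the sketch assumes the property has to be set before PDFBox first loads its glyph tables.</p>

```java
import java.io.File;

/** Hypothetical helper; the class and method names are not PDFBox API. */
final class GlyphListConfig {
    /** Name of the JVM property mentioned in the text above. */
    static final String PROPERTY = "glyphlist_ext";

    /**
     * Points the property at the given file if it exists.
     * Returns true when the property was set, false when the file is missing.
     */
    static boolean useExternalGlyphList(String path) {
        File file = new File(path);
        if (!file.isFile()) {
            return false;
        }
        System.setProperty(PROPERTY, file.getAbsolutePath());
        return true;
    }

    public static void main(String[] args) {
        // "my-glyphlist.txt" is a placeholder file name.
        String path = args.length > 0 ? args[0] : "my-glyphlist.txt";
        if (useExternalGlyphList(path)) {
            System.out.println(PROPERTY + " = " + System.getProperty(PROPERTY));
        } else {
            System.out.println("Glyph list file not found: " + path);
        }
    }
}
```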
+<h3 id="right-to-left-text">Right to Left Text</h3>
+<p>Extracting text in languages whose text goes from right to left (such as 
Arabic and Hebrew)
+in PDF files can result in text that is backwards. PDFBox can normalize and 
reverse the text
+if the ICU4J jar file has been placed on the classpath (it is an optional 
dependency). 
+Note that you should also enable sorting with either 
org.apache.pdfbox.util.PDFTextStripper 
+or org.apache.pdfbox.ExtractText to ensure accurate output.</p> 
+            </div>
+        </div>
+    </div>
+
+    <footer class="footer">
+        <div class="container">
+            <div class="row">
+                <div class="span3">
+                    <!-- nothing in here on purpose -->
+                </div>
+                <div class="span9">
+                    <p>Copyright © 2009&ndash;2015 <a href="http://www.apache.org/">The Apache Software Foundation</a>, Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+                        <br/>Apache PDFBox, PDFBox, Apache, the Apache feather logo and the Apache PDFBox project logos are trademarks of The Apache Software Foundation.</p>
+                </div>
+            </div>
+        </div>
+    </footer>
+
+</body>
+
+</html>

