[2/3] pdfbox-docs git commit: PDFBOX-3040: use .md for markdown files

msahyoun Fri, 30 Oct 2015 08:29:45 -0700

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/1.8/dependencies.md
----------------------------------------------------------------------
diff --git a/content/1.8/dependencies.md b/content/1.8/dependencies.md
new file mode 100644
index 0000000..da3174f
--- /dev/null
+++ b/content/1.8/dependencies.md
@@ -0,0 +1,96 @@
+---
+layout: default
+title:  Dependencies
+---
+
+# Dependencies
+
+PDFBox consists of three related components and depends on a few external 
libraries. This page describes what these libraries are and how to include them 
in your application.
+
+## Core components
+
+<p class="alert alert-info">These components are needed at runtime, during 
development, and for testing, depending on the details below.</p>
+
+The three PDFBox components are named ```pdfbox```, ```fontbox``` and 
```jempbox```. The Maven groupId of all PDFBox components is org.apache.pdfbox.
+
+### Minimum Requirement
+
+- Java 1.5
+- [commons-logging](http://commons.apache.org/logging/)
+
+The main PDFBox component, pdfbox, has a hard dependency on the 
[commons-logging](http://commons.apache.org/logging/) library.
+Commons Logging is a generic wrapper around different logging frameworks, so 
you'll either need to also use a logging library like 
[log4j](http://logging.apache.org/log4j/)
+or let commons-logging fall back to the standard [java.util.logging 
API](http://java.sun.com/j2se/1.4.2/docs/guide/util/logging/overview.html)
+included in the Java platform.
+
+### Font Handling
+For font handling the fontbox component is needed.
+
+### XMP Metadata 
+To support XMP metadata the jempbox component is needed.
+
+To add the pdfbox, fontbox, jempbox and commons-logging jars to your 
application, the easiest thing is to declare the Maven dependency shown below. 
This gives you the main
+pdfbox library directly and the other required jars as transitive dependencies.
+
+    <dependency>
+      <groupId>org.apache.pdfbox</groupId>
+      <artifactId>pdfbox</artifactId>
+      <version>...</version>
+    </dependency>
+
+Set the version field to the latest stable PDFBox version.
+
+## Optional dependencies
+
+Some features in PDFBox depend on optional external libraries. You can enable 
these features simply by including the required libraries in the classpath of 
your application.
+
+### Extended Image Format Support
+
+To read JBIG2 images and to write TIFF images, additional libraries are needed.
+
+<p class="alert alert-warning">The image plugins described below are not part 
of the PDFBox distribution because of incompatible licensing terms. Please make 
sure to check whether the licensing terms are compatible with your usage.</p>
+
+For **JBIG2** support, a Java ImageIO plugin such as the [Levigo 
Plugin](https://github.com/levigo/jbig2-imageio) or
+[JBIG2-Image-Decoder](https://github.com/Borisvl/JBIG2-Image-Decoder) is needed.
+
+To write **TIFF** images a JAI ImageIO Core library will be needed. 
+
+### PDF Encryption and Signing
+The most notable such optional feature is support for PDF encryption. Instead 
of implementing its own encryption algorithms, PDFBox uses libraries from the 
+[Legion of the Bouncy Castle](http://www.bouncycastle.org/). Both the bcprov 
and bcmail libraries are needed and can be included using the Maven 
dependencies shown below.
+
+    <dependency>
+      <groupId>org.bouncycastle</groupId>
+      <artifactId>bcprov-jdk15</artifactId>
+      <version>1.44</version>
+    </dependency>
+    <dependency>
+      <groupId>org.bouncycastle</groupId>
+      <artifactId>bcmail-jdk15</artifactId>
+      <version>1.44</version>
+    </dependency>
+ 
+<br/>
+
+### Support for bidirectional languages
+Another important optional feature is support for bidirectional languages like 
Arabic. PDFBox uses the ICU4J library from the 
+[International Components for Unicode](http://site.icu-project.org/) (ICU) 
project to support such languages in PDF documents. To add the ICU4J jar to 
your project, 
+use the following Maven dependency.
+
+    <dependency>
+      <groupId>com.ibm.icu</groupId>
+      <artifactId>icu4j</artifactId>
+      <version>3.8</version>
+    </dependency>
+
+PDFBox also contains extra support for use with the 
[Lucene](http://lucene.apache.org/) and [Ant](http://ant.apache.org/) projects. 
Since in these cases PDFBox is just an
+add-on feature to these projects, you should first set up your application to 
use Lucene or Ant and then add PDFBox support as described on this page.
+
+## Dependencies for Ant builds
+
+The above instructions expect that you're using 
[Maven](http://maven.apache.org/) or another build tool like 
[Ivy](http://ant.apache.org/ivy/) that supports Maven dependencies.
+If you instead use tools like [Ant](http://ant.apache.org/) where you need to 
explicitly include all the required library jars in your application, you'll 
need to do
+something different.
+
+The easiest approach is to run ``mvn dependency:copy-dependencies`` inside the 
pdfbox directory of the latest PDFBox source release. This will copy all the 
required and optional
+libraries discussed above into the pdfbox/target/dependency directory, which 
is the default output location of the Maven dependency plugin. You can then 
simply copy all the libraries you need from this directory to your application.
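
Once the jars have been copied into an application, it can help to confirm which of the optional libraries are actually visible at runtime. The following is a minimal sketch, not part of PDFBox: the class name `OptionalLibraryCheck` is hypothetical, and the marker classes are assumptions chosen as one well-known class per library. It uses only `Class.forName` from the Java platform.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OptionalLibraryCheck {

    /** Returns true if the named class can be located on the current classpath. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, OptionalLibraryCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Marker classes are assumptions for illustration, one per library.
        Map<String, String> markers = new LinkedHashMap<>();
        markers.put("commons-logging", "org.apache.commons.logging.LogFactory");
        markers.put("bcprov", "org.bouncycastle.jce.provider.BouncyCastleProvider");
        markers.put("icu4j", "com.ibm.icu.text.Bidi");

        for (Map.Entry<String, String> e : markers.entrySet()) {
            String state = isPresent(e.getValue()) ? "found" : "missing";
            System.out.println(e.getKey() + ": " + state);
        }
    }
}
```

Running this with the same classpath as the application lists each library as found or missing, which makes a forgotten jar easier to spot than a later failure deep inside an encryption or text extraction call.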


http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/1.8/dependencies.mdtext
----------------------------------------------------------------------
diff --git a/content/1.8/dependencies.mdtext b/content/1.8/dependencies.mdtext
deleted file mode 100644
index da3174f..0000000
--- a/content/1.8/dependencies.mdtext
+++ /dev/null
@@ -1,96 +0,0 @@
----
-layout: default
-title:  Dependencies
----
-
-# Dependencies
-
-PDFBox consists of a three related components and depends on a few external 
libraries. This page describes what these libraries are and how to include them 
in your application.
-
-## Core components
-
-<p class="alert alert-info">These components are needed during runtime, 
development and testing dependent on the details below.</p>
-
-The three PDFBox components are named ```pdfbox```, ```fontbox``` and 
```jempbox```. The Maven groupId of all PDFBox components is org.apache.pdfbox.
-
-### Minimum Requirement
-
-- Java 1.5
-- [commons-logging](http://commons.apache.org/logging/)
-
-The main PDFBox component, pdfbox, has a hard dependency on the 
[commons-logging](http://commons.apache.org/logging/) library.
-Commons Logging is a generic wrapper around different logging frameworks, so 
you'll either need to also use a logging library like 
[log4j](http://logging.apache.org/log4j/)
-or let commons-logging fall back to the standard [java.util.logging 
API](http://java.sun.com/j2se/1.4.2/docs/guide/util/logging/overview.html)
-included in the Java platform.
-
-### Font Handling
-For font handling the fontbox component is needed.
-
-### XMP Metadata 
-To support XMP metadata the jembox component is needed.
-
-To add the pdfbox, fontbox, jempbox and commons-logging jars to your 
application, the easiest thing is to declare the Maven dependency shown below. 
This gives you the main
-pdfbox library directly and the other required jars as transitive dependencies.
-
-    <dependency>
-      <groupId>org.apache.pdfbox</groupId>
-      <artifactId>pdfbox</artifactId>
-      <version>...</version>
-    </dependency>
-
-Set the version field to the latest stable PDFBox version.
-
-## Optional dependencies
-
-Some features in PDFBox depend on optional external libraries. You can enable 
these features simply by including the required libraries in the classpath of 
your application.
-
-### Extented Image Format Support
-
-To support JBIG2 and writing TIFF images additional libraries are needed. 
-
-<p class="alert alert-warning">The image plugins described below are not part 
of the PDFBox distribution because of incompatible licensing terms. Please make 
sure to check if the licensing terms are compatible to your usage.</p>
-
-For **JBIG2** support a Java ImageIO Plugin such as the [Levigo 
Plugin](https://github.com/levigo/jbig2-imageio) or [JBIG2-Image-Decoder
-](https://github.com/Borisvl/JBIG2-Image-Decoder) will be needed. 
-
-To write **TIFF** images a JAI ImageIO Core library will be needed. 
-
-#### PDF Encryption and Signing
-The most notable such optional feature is support for PDF encryption. Instead 
of implementing its own encryption algorithms, PDFBox uses libraries from the 
-[Legion of the Bouncy Castle](http://www.bouncycastle.org/). Both the bcprov 
and bcmail libraries are needed and can be included using the Maven 
dependencies shown below.
-
-    <dependency>
-      <groupId>org.bouncycastle</groupId>
-      <artifactId>bcprov-jdk15</artifactId>
-      <version>1.44</version>
-    </dependency>
-    <dependency>
-      <groupId>org.bouncycastle</groupId>
-      <artifactId>bcmail-jdk15</artifactId>
-      <version>1.44</version>
-    </dependency>
- 
-<br/>
-
-#### Support for bidirectional languages
-Another important optional feature is support for bidirectional languages like 
Arabic. PDFBox uses the ICU4J library from the 
-[International Components for Unicode](http://site.icu-project.org/) (ICU) 
project to support such languages in PDF documents. To add the ICU4J jar to 
your project, 
-use the following Maven dependency.
-
-    <dependency>
-      <groupId>com.ibm.icu</groupId>
-      <artifactId>icu4j</artifactId>
-      <version>3.8</version>
-    </dependency>
-
-PDFBox also contains extra support for use with the 
[Lucene](http://lucene.apache.org/) and [Ant](http://ant.apache.org/) projects. 
Since in these cases PDFBox is just an
-add-on feature to these projects, you should first set up your application to 
use Lucene or Ant and then add PDFBox support as described on this page.
-
-## Dependencies for Ant builds
-
-The above instructions expect that you're using 
[Maven](http://maven.apache.org/) or another build tool like 
[Ivy](http://ant.apache.org/ivy/) that supports Maven dependencies.
-If you instead use tools like [Ant](http://ant.apache.org/) where you need to 
explicitly include all the required library jars in your application, you'll 
need to do
-something different.
-
-The easiest approach is to run ``mvn dependency:copy-dependencies`` inside the 
pdfbox directory of the latest PDFBox source release. This will copy all the 
required and optional
-libraries discussed above into the pdfbox/target/dependencies directory. You 
can then simply copy all the libraries you need from this directory to your 
application.

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/1.8/faq.md
----------------------------------------------------------------------
diff --git a/content/1.8/faq.md b/content/1.8/faq.md
new file mode 100644
index 0000000..018af6d
--- /dev/null
+++ b/content/1.8/faq.md
@@ -0,0 +1,143 @@
+---
+layout: default
+title:  Frequently Asked Questions (FAQ)
+Notice:    Licensed to the Apache Software Foundation (ASF) under one
+           or more contributor license agreements.  See the NOTICE file
+           distributed with this work for additional information
+           regarding copyright ownership.  The ASF licenses this file
+           to you under the Apache License, Version 2.0 (the
+           "License"); you may not use this file except in compliance
+           with the License.  You may obtain a copy of the License at
+           .
+             http://www.apache.org/licenses/LICENSE-2.0
+           .
+           Unless required by applicable law or agreed to in writing,
+           software distributed under the License is distributed on an
+           "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+           KIND, either express or implied.  See the License for the
+           specific language governing permissions and limitations
+           under the License.
+---
+
+# Frequently asked questions
+
+### General Questions
+
+ - [I am getting the below Log4J warning message, how do I remove it?](#log4j)
+ - [Is PDFBox thread safe?](#threadsafe)
+ - [Why do I get a "Warning: You did not close the PDF Document"?](#notclosed)
+
+### Text Extraction
+
+ - [How come I am not getting any text from the PDF document?](#notext)
+ - [How come I am getting gibberish(G38G43G36G51G5) when extracting 
text?](#gibberish)
+ - [What does "java.io.IOException: Can't handle font width" mean?](#fontwidth)
+ - [Why do I get "You do not have permission to extract text" on some 
documents?](#permission)
+ - [Can't we just extract the text without parsing the whole document or 
extract text as it is parsed?](#partially)
+
+## General Questions
+
+<a name="log4j"></a>
+### I am getting the below Log4J warning message, how do I remove it? ###
+
+```text
+log4j:WARN No appenders could be found for logger 
(org.apache.pdfbox.util.ResourceLoader).
+log4j:WARN Please initialize the log4j system properly.
+```
+
+This message means that you need to configure the log4j logging system.
+See the [log4j documentation](http://logging.apache.org/log4j/1.2/manual.html) 
for more information.
+
+PDFBox comes with a sample log4j configuration file.  To use it, set a 
system property like this:
+
+```shell
+java -Dlog4j.configuration=log4j.xml org.apache.pdfbox.ExtractText <PDF-file> 
<output-text-file>
+```
+
+If this is not working for you then you may have to specify the log4j config 
file using a URL path, like this:
+
+```java
+log4j.configuration=file:///<path to config file>
+```
+
+Please see 
[this](https://sourceforge.net/forum/forum.php?thread_id=1254229&amp;forum_id=267205)
 forum thread 
+for more information.
+
+<a name="threadsafe"></a>
+### Is PDFBox thread safe? ###
+
+No! Only one thread may access a single document at a time. You can have 
multiple threads
+each accessing their own PDDocument object.
+
+<a name="notclosed"></a>
+### Why do I get a "Warning: You did not close the PDF Document"? ###
+
+You need to call close() on the PDDocument inside a finally block; if you
+don't, the document will not be closed properly.  Also, you must close all
+PDDocument objects that get created.  The following code creates **two**
+PDDocument objects, one from the "new PDDocument()" and the second by the load 
method, but closes only the second, so the first is never closed.
+
+```java
+PDDocument doc = new PDDocument();
+try
+{
+   doc = PDDocument.load( "my.pdf" );
+}
+finally
+{
+   if( doc != null )
+   {
+      doc.close();
+   }
+}
+```
+
+## Text Extraction
+
+<a name="notext"></a>
+### How come I am not getting any text from the PDF document? ###
+
+Text extraction from a PDF document is a complicated task and there are many 
factors
+involved that affect the possibility and accuracy of text extraction.  It 
would be helpful
+to the PDFBox team if you could try a couple of things.
+
+ - Open the PDF in Acrobat and try to extract text from there.  If Acrobat can 
extract text then PDFBox 
+should be able to as well and it is a bug if it cannot.  If Acrobat cannot 
extract text then PDFBox 'probably' cannot either.
+ - It might really be an image instead of text.  Some PDF documents are just 
images that have been scanned in.
+You can tell by using the selection tool in Acrobat; if you can't select any 
text then it is probably an image.
+
+<a name="gibberish"></a>
+### How come I am getting gibberish(G38G43G36G51G5) when extracting text? ###
+
+This is because the characters in a PDF document can use a custom encoding
+instead of Unicode or ASCII.  When you see gibberish text, it
+probably means that a meaningless internal encoding is being used.  The
+only way to access the text is to use OCR.  This may be a future
+enhancement.
+
+<a name="fontwidth"></a>
+### What does "java.io.IOException: Can't handle font width" mean? ###
+
+This probably means that the "Resources" directory is not in your classpath. 
The
+Resources directory is included in the PDFBox jar so this is only a problem if 
you
+are building PDFBox yourself and not using the binary.
+
+<a name="permission"></a>
+### Why do I get "You do not have permission to extract text" on some 
documents? ###
+
+PDF documents have certain security permissions that can be applied to them 
and two 
+passwords associated with them, a user password and a master password. If the 
"cannot extract text"
+permission bit is set then you need to decrypt the document with the master 
password in order
+to extract the text.
+
+<a name="partially"></a>
+### Can't we just extract the text without parsing the whole document or 
extract text as it is parsed? ###
+
+Not really, for a couple of reasons.
+
+ - If the document is encrypted then you need to parse at least until the 
encryption dictionary before 
+you can decrypt.
+ - Sometimes the PDFont contains vital information needed for text extraction.
+ - Text on a page does not have to be drawn in reading order. For example: if 
the page said "Hello World",
+the PDF could have been written such that "World" gets drawn and then the 
cursor moves to the left and 
+the word "Hello" is drawn.
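
The "Hello World" case above can be sketched in plain Java. This is an illustration only and does not use the PDFBox API: `ReadingOrderSketch` and `Chunk` are hypothetical names, and the coordinates are made up. It shows why the whole page must be seen before text can be emitted, since the pieces have to be sorted by position rather than taken in drawing order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ReadingOrderSketch {

    /** A drawn piece of text and the position where it starts (hypothetical type). */
    static final class Chunk {
        final float x;   // horizontal position, increasing to the right
        final float y;   // vertical position, increasing upward as in PDF user space
        final String text;

        Chunk(float x, float y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    /** Joins chunks top to bottom, then left to right, ignoring drawing order. */
    static String inReadingOrder(List<Chunk> drawn) {
        List<Chunk> sorted = new ArrayList<>(drawn);
        sorted.sort(Comparator.comparingDouble((Chunk c) -> -c.y)  // higher on the page first
                              .thenComparingDouble(c -> c.x));      // then left to right
        StringBuilder out = new StringBuilder();
        for (Chunk c : sorted) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(c.text);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Chunk> drawn = new ArrayList<>();
        drawn.add(new Chunk(140f, 700f, "World"));  // drawn first
        drawn.add(new Chunk(100f, 700f, "Hello"));  // drawn second, further left
        System.out.println(inReadingOrder(drawn));  // prints "Hello World"
    }
}
```

A real extractor also has to cope with rotated text, columns, and slightly uneven baselines, so a plain sort like this one is only the starting point.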

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/1.8/faq.mdtext
----------------------------------------------------------------------
diff --git a/content/1.8/faq.mdtext b/content/1.8/faq.mdtext
deleted file mode 100644
index 018af6d..0000000
--- a/content/1.8/faq.mdtext
+++ /dev/null
@@ -1,143 +0,0 @@
----
-layout: default
-title:  Frequently Asked Questions (FAQ)
-Notice:    Licensed to the Apache Software Foundation (ASF) under one
-           or more contributor license agreements.  See the NOTICE file
-           distributed with this work for additional information
-           regarding copyright ownership.  The ASF licenses this file
-           to you under the Apache License, Version 2.0 (the
-           "License"); you may not use this file except in compliance
-           with the License.  You may obtain a copy of the License at
-           .
-             http://www.apache.org/licenses/LICENSE-2.0
-           .
-           Unless required by applicable law or agreed to in writing,
-           software distributed under the License is distributed on an
-           "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-           KIND, either express or implied.  See the License for the
-           specific language governing permissions and limitations
-           under the License.
----
-
-# Frequently asked questions
-
-### General Questions
-
- - [I am getting the below Log4J warning message, how do I remove it?](#log4j)
- - [Is PDFBox thread safe?](#threadsafe)
- - [Why do I get a "Warning: You did not close the PDF Document"?](#notclosed)
-
-### Text Extraction
-
- - [How come I am not getting any text from the PDF document?](#notext)
- - [How come I am getting gibberish(G38G43G36G51G5) when extracting 
text?](#gibberish)
- - [What does "java.io.IOException: Can't handle font width" mean?](#fontwidth)
- - [Why do I get "You do not have permission to extract text" on some 
documents?](#permission)
- - [Can't we just extract the text without parsing the whole document or 
extract text as it is parsed?](#partially)
-
-## General Questions
-
-<a name="log4j"></a>
-### I am getting the below Log4J warning message, how do I remove it? ###
-
-```java
-log4j:WARN No appenders could be found for logger 
(org.apache.pdfbox.util.ResourceLoader).
-log4j:WARN Please initialize the log4j system properly.
-```
-
-This message means that you need to configure the log4j logging system.
-See the [log4j documentation](http://logging.apache.org/log4j/1.2/manual.html) 
for more information.
-
-PDFBox comes with a sample log4j configuration file.  To use it you set a 
system property like this
-
-```java
-java -Dlog4j.configuration=log4j.xml org.apache.pdfbox.ExtractText <PDF-file> 
<output-text-file>
-```
-
-If this is not working for you then you may have to specify the log4j config 
file using a URL path, like this:
-
-```java
-log4j.configuration=file:///<path to config file>
-```
-
-Please see 
[this](https://sourceforge.net/forum/forum.php?thread_id=1254229&amp;forum_id=267205)
 forum thread 
-for more information.
-
-<a name="threadsafe"></a>
-### Is PDFBox thread safe? ###
-
-No! Only one thread may access a single document at a time. You can have 
multiple threads
-each accessing their own PDDocument object.
-
-<a name="notclosed"></a>
-### Why do I get a "Warning: You did not close the PDF Document"? ###
-
-You need to call close() on the PDDocument inside the finally block, if you
-don't then the document will not be closed properly.  Also, you must close all
-PDDocument objects that get created.  The following code creates **two**
-PDDocument objects; one from the "new PDDocument()" and the second by the load 
method.
-
-```java
-PDDocument doc = new PDDocument();
-try
-{
-   doc = PDDocument.load( "my.pdf" );
-}
-finally
-{
-   if( doc != null )
-   {
-      doc.close();
-   }
-}
-```
-
-## Text Extraction
-
-<a name="notext"></a>
-### How come I am not getting any text from the PDF document? ###
-
-Text extraction from a pdf document is a complicated task and there are many 
factors
-involved that effect the possibility and accuracy of text extraction.  It 
would be helpful
-to the PDFBox team if you could try a couple things.
-
- - Open the PDF in Acrobat and try to extract text from there.  If Acrobat can 
extract text then PDFBox 
-should be able to as well and it is a bug if it cannot.  If Acrobat cannot 
extract text then PDFBox 'probably' cannot either.
- - It might really be an image instead of text.  Some PDF documents are just 
images that have been scanned in.
-You can tell by using the selection tool in Acrobat, if you can't select any 
text then it is probably an image.
-
-<a name="gibberish"></a>
-### How come I am getting gibberish(G38G43G36G51G5) when extracting text? ###
-
-This is because the characters in a PDF document can use a custom encoding
-instead of unicode or ASCII.  When you see gibberish text then it
-probably means that a meaningless internal encoding is being used.  The
-only way to access the text is to use OCR.  This may be a future
-enhancement.
-
-<a name="fontwidth"></a>
-### What does "java.io.IOException: Can't handle font width" mean? ###
-
-This probably means that the "Resources" directory is not in your classpath. 
The
-Resources directory is included in the PDFBox jar so this is only a problem if 
you
-are building PDFBox yourself and not using the binary.
-
-<a name="permission"></a>
-### Why do I get "You do not have permission to extract text" on some 
documents? ###
-
-PDF documents have certain security permissions that can be applied to them 
and two 
-passwords associated with them, a user password and a master password. If the 
"cannot extract text"
-permission bit is set then you need to decrypt the document with the master 
password in order
-to extract the text.
-
-<a name="partially"></a>
-### Can't we just extract the text without parsing the whole document or 
extract text as it is parsed? ###
-
-Not really, for a couple reasons.
-
- - If the document is encrypted then you need to parse at least until the 
encryption dictionary before 
-you can decrypt.
- - Sometimes the PDFont contains vital information needed for text extraction.
- - Text on a page does not have to be drawn in reading order. For example: if 
the page said "Hello World",
-the pdf could have been written such that "World" gets drawn and then the 
cursor moves to the left and 
-the word "Hello" is drawn.

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/2.0/dependencies.md
----------------------------------------------------------------------
diff --git a/content/2.0/dependencies.md b/content/2.0/dependencies.md
new file mode 100644
index 0000000..3a212d3
--- /dev/null
+++ b/content/2.0/dependencies.md
@@ -0,0 +1,56 @@
+---
+layout: default
+title:  Dependencies
+---
+
+<p class="alert alert-warning">This is an unreleased development preview and 
may change without notice.</p>
+
+# Dependencies
+
+PDFBox has the following basic dependencies:
+
+- Java 6
+- [commons-logging](http://commons.apache.org/logging/)
+
+Commons Logging is a generic wrapper around different logging frameworks, so 
you'll either need to also use a logging library like 
[log4j](http://logging.apache.org/log4j/)
+or let commons-logging fall back to the standard [java.util.logging 
API](http://java.sun.com/j2se/1.4.2/docs/guide/util/logging/overview.html)
+included in the Java platform.
+
+## Optional components
+
+PDFBox does not ship with all features enabled. Third-party components are 
necessary to get full support for certain functionality.
+
+### JAI Image I/O
+
+PDF supports embedded image files; however, support for some formats requires 
third-party libraries which are distributed under terms incompatible with the 
Apache 2.0 license:
+
+- Reading **JBIG2** images: [JBIG2 
ImageIO](https://github.com/levigo/jbig2-imageio) or
+  [JBIG2-Image-Decoder](https://github.com/Borisvl/JBIG2-Image-Decoder)
+- Reading **JPEG 2000 (JPX)** images: [JAI Image I/O Tools 
Core](https://java.net/projects/jai-imageio-core)
+- Writing **TIFF** images: also requires *JAI Image I/O Tools Core*.
+
+These libraries are optional and will be loaded if present on the classpath. 
Otherwise, support for these image formats will be disabled and a warning will be 
logged when an unsupported image is encountered.
+
+Maven dependencies for these components can be found in 
[parent/pom.xml](https://svn.apache.org/viewvc/pdfbox/trunk/parent/pom.xml?view=markup).
 Please make sure that any third party licenses are suitable for your project.
+
+### Encryption and Signing
+
+Encrypting and signing PDFs require the *bcprov* and *bcmail* libraries from 
the [Legion of the Bouncy Castle](http://www.bouncycastle.org/). These can be 
included in your Maven project using the following dependencies:
+
+    <dependency>
+        <groupId>org.bouncycastle</groupId>
+        <artifactId>bcprov-jdk15on</artifactId>
+        <version>1.53</version>
+    </dependency>
+    
+    <dependency>
+        <groupId>org.bouncycastle</groupId>
+        <artifactId>bcmail-jdk15on</artifactId>
+        <version>1.53</version>
+    </dependency>
+
+### Java Cryptography Extension (JCE)
+
+256-bit AES encryption requires a JDK with "unlimited strength" cryptography, 
which requires extra files to be installed. For JDK 7, see [Java Cryptography 
Extension 
(JCE)](http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html).
 If these files are not installed, building PDFBox will throw an exception with 
the following message:
+
+    JCE unlimited strength jurisdiction policy files are not installed
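
Whether the unlimited strength policy files are in effect can be checked before building or running PDFBox. The following is a minimal sketch using only the standard `javax.crypto.Cipher.getMaxAllowedKeyLength` call; the class name `JceStrengthCheck` and the printed messages are hypothetical and not part of PDFBox.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

class JceStrengthCheck {

    /** Returns true if the running JVM permits AES keys of at least 256 bits. */
    static boolean allowsAes256() throws NoSuchAlgorithmException {
        // Reports the largest AES key size allowed by the installed policy files
        // (Integer.MAX_VALUE when no limit applies).
        return Cipher.getMaxAllowedKeyLength("AES") >= 256;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        System.out.println("Max AES key length: " + Cipher.getMaxAllowedKeyLength("AES"));
        System.out.println(allowsAes256()
                ? "256-bit AES is available"
                : "Unlimited strength policy files appear to be missing");
    }
}
```

If the second message is printed, installing the policy files for the JDK in use should be the first thing to try.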

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/2.0/dependencies.mdtext
----------------------------------------------------------------------
diff --git a/content/2.0/dependencies.mdtext b/content/2.0/dependencies.mdtext
deleted file mode 100644
index 3a212d3..0000000
--- a/content/2.0/dependencies.mdtext
+++ /dev/null
@@ -1,56 +0,0 @@
----
-layout: default
-title:  Dependencies
----
-
-<p class="alert alert-warning">This is an unreleased development preview and 
may change without notice.</p>
-
-# Dependencies
-
-PDFBox has the following basic dependencies:
-
-- Java 6
-- [commons-logging](http://commons.apache.org/logging/)
-
-Commons Logging is a generic wrapper around different logging frameworks, so 
you'll either need to also use a logging library like 
[log4j](http://logging.apache.org/log4j/)
-or let commons-logging fall back to the standard [java.util.logging 
API](http://java.sun.com/j2se/1.4.2/docs/guide/util/logging/overview.html)
-included in the Java platform.
-
-## Optional components
-
-PDFBox does not ship with all features enabled. Third party compoenets are 
necessary to get full support for certain functionality.
-
-### JAI Image I/O
-
-PDF supports embedded image files, however support for some formats require 
third party libraries which are distributed under terms incompatible with the 
Apache 2.0 license:
-
-- Reading **JBIG2** images: [JBIG2 
ImageIO](https://github.com/levigo/jbig2-imageio) or [JBIG2-Image-Decoder
-](https://github.com/Borisvl/JBIG2-Image-Decoder)
-- Reading **JPEG 2000 (JPX)** images: [JAI Image I/O Tools 
Core](https://java.net/projects/jai-imageio-core)
-- Writing **TIFF** images requires *JAI Image I/O Tools Core* also.
-
-These libraries are optional and will be loaded if present on the classpath, 
otherwise support for these image formats will be disable and a warning will be 
logged when an unsupported image is encountered.
-
-Maven dependencies for these components can be found in 
[parent/pom.xml](https://svn.apache.org/viewvc/pdfbox/trunk/parent/pom.xml?view=markup).
 Please make sure that any third party licenses are suitable for your project.
-
-### Encryption and Signing
-
-Encrypting and sigining PDFs requires the *bcprov* and *bcmail* libraries from 
the [Legion of the Bouncy Castle](http://www.bouncycastle.org/). These can be 
included in your Maven project using the following dependencies:
-
-    <dependency>
-        <groupId>org.bouncycastle</groupId>
-        <artifactId>bcprov-jdk15on</artifactId>
-        <version>1.53</version>
-    </dependency>
-    
-    <dependency>
-        <groupId>org.bouncycastle</groupId>
-        <artifactId>bcmail-jdk15on</artifactId>
-        <version>1.53</version>
-    </dependency>
-
-### Java Cryptography Extension (JCE)
-
-256-bit AES encryption requires a JDK with "unlimited strength" cryptography, 
which requires extra files to be installed. For JDK 7, see [Java Cryptography 
Extension 
(JCE)](http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html).
 If these files are not installed, building PDFBox will throw an exception with 
the following message:
-
-    JCE unlimited strength jurisdiction policy files are not installed

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/2.0/examples.md
----------------------------------------------------------------------
diff --git a/content/2.0/examples.md b/content/2.0/examples.md
new file mode 100644
index 0000000..cdd8be8
--- /dev/null
+++ b/content/2.0/examples.md
@@ -0,0 +1,9 @@
+---
+layout: default
+title:  Examples
+---
+<p class="alert alert-warning">This is an unreleased development preview and 
may change without notice.</p>
+
+# Examples
+
+This content is under construction. Please look at our 
[examples](https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/)
 directory in SVN.

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/2.0/examples.mdtext
----------------------------------------------------------------------
diff --git a/content/2.0/examples.mdtext b/content/2.0/examples.mdtext
deleted file mode 100644
index cdd8be8..0000000
--- a/content/2.0/examples.mdtext
+++ /dev/null
@@ -1,9 +0,0 @@
----
-layout: default
-title:  Examples
----
-<p class="alert alert-warning">This is an unreleased development preview and 
may change without notice.</p>
-
-# Examples
-
-This content is under construction. Please look at our 
[examples](https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/)
 directory in SVN.

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/2.0/getting-started.md
----------------------------------------------------------------------
diff --git a/content/2.0/getting-started.md b/content/2.0/getting-started.md
new file mode 100644
index 0000000..a4ecc14
--- /dev/null
+++ b/content/2.0/getting-started.md
@@ -0,0 +1,33 @@
+---
+layout: default
+title:  Getting Started
+---
+
+<p class="alert alert-warning">This is an unreleased development preview and 
may change without notice.</p>
+
+# Getting Started
+
+This content is under construction.
+
+## Maven
+
+To use the latest 2.0 snapshot release from the SVN trunk, you'll need to add 
the following dependency:
+
+    <dependency>
+      <groupId>org.apache.pdfbox</groupId>
+      <artifactId>pdfbox</artifactId>
+      <version>2.0.0-SNAPSHOT</version>
+    </dependency>
+
+You'll also need to add the following repository:
+
+    <repository>
+      <id>ApacheSnapshot</id>
+      <name>Apache Repository</name>
+      <url>https://repository.apache.org/content/groups/snapshots/</url>
+      <snapshots>
+        <enabled>true</enabled>
+      </snapshots>
+    </repository>
+
+Please note that this will use the latest **unstable** development snapshot.

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/2.0/getting-started.mdtext
----------------------------------------------------------------------
diff --git a/content/2.0/getting-started.mdtext 
b/content/2.0/getting-started.mdtext
deleted file mode 100644
index a4ecc14..0000000
--- a/content/2.0/getting-started.mdtext
+++ /dev/null
@@ -1,33 +0,0 @@
----
-layout: default
-title:  Getting Started
----
-
-<p class="alert alert-warning">This is an unreleased development preview and 
may change without notice.</p>
-
-# Getting Started
-
-This content is under construction.
-
-## Maven
-
-To use the latest 2.0 snapshot release from the SVN trunk, you'll need to add 
the following dependency:
-
-    <dependency>
-      <groupId>org.apache.pdfbox</groupId>
-      <artifactId>pdfbox</artifactId>
-      <version>2.0.0-SNAPSHOT</version>
-    </dependency>
-
-You'll also need to add the following repository:
-
-    <repository>
-      <id>ApacheSnapshot</id>
-      <name>Apache Repository</name>
-      <url>https://repository.apache.org/content/groups/snapshots/</url>
-      <snapshots>
-        <enabled>true</enabled>
-      </snapshots>
-    </repository>
-
-Please note that this will use the latest **unstable** development snapshot.

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/building.md
----------------------------------------------------------------------
diff --git a/content/building.md b/content/building.md
new file mode 100644
index 0000000..bf1e4ef
--- /dev/null
+++ b/content/building.md
@@ -0,0 +1,70 @@
+---
+layout: default
+title:  Building PDFBox
+---
+
+# Building from Source
+
+Building PDFBox from source is only necessary if you want to contribute 
code to the PDFBox project. Most users should use the [binary 
releases](http://pdfbox.apache.org/download.cgi) instead.
+
+## Obtaining the Source
+
+You can obtain the latest source of PDFBox from our [SVN 
repo](http://pdfbox.apache.org/download.cgi). The current trunk is 
v2.0.0-SNAPSHOT. There is a separate branch for the 1.8.x series. You can fetch 
the latest 2.0 trunk using Subversion:
+
+    svn checkout http://svn.apache.org/repos/asf/pdfbox/trunk/
+    cd trunk
+
+## Build dependencies
+
+### PDFBox 1.8
+
+- JDK 5 or 6
+-  [Maven 2](http://maven.apache.org/)
+
+### PDFBox 2.0
+
+- JDK 6+
+- Java Cryptography Extension (JCE) [see below]
+-  [Maven 2](http://maven.apache.org/)
+
+### Java Cryptography Extension (JCE)
+
+Building PDFBox 2.0 requires a JDK with "unlimited strength" cryptography, 
which requires extra files to be installed. For JDK 7, see [Java Cryptography 
Extension 
(JCE)](http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html).
 If these files are not installed, building PDFBox will fail the following test:
+
+    TestPublicKeyEncryption.setUp:70 JCE unlimited strength jurisdiction 
policy files are not installed
+    
+## Building with Maven
+
+In the root directory of PDFBox:
+
+    mvn clean install
+
+---
+
+## Building with Ant (Deprecated, removed in 2.0.0)
+
+The old Ant build is still available, and can be used especially for
+building .NET binaries with IKVM:
+
+1.  Install [ANT](http://ant.apache.org/). PDFBox currently uses 1.6.2
+    but other versions probably work as well.
+2.  (optional) Setup IKVM, if you want to build the .NET DLL version of
+    PDFBox.
+    1.  [IKVM](http://www.ikvm.net/) binaries
+    2.  In the build.properties, set the ikvm.dir property:\
+         `ikvm.dir=C:\\javalib\\ikvm-12-07-2004\\ikvm`
+
+3.  Run "`ant`" from the root PDFBox directory. This will create the
+    .zip package distribution. See the build file for other ant targets.
+
+NOTE: If you want to run PDFBox from an IDE then you will need to add
+the 'Resources' directory to the project classpath in your IDE.
+
+### Dependencies for Ant Builds
+
+The above instructions expect that you're using 
[Maven](http://maven.apache.org/) or another build tool like 
[Ivy](http://ant.apache.org/ivy/) that supports Maven dependencies.
+If you instead use tools like [Ant](http://ant.apache.org/) where you need to 
explicitly include all the required library jars in your application, you'll 
need to do
+something different.
+
+The easiest approach is to run ``mvn dependency:copy-dependencies`` inside the 
pdfbox directory of the latest PDFBox source release. This will copy all the 
required and optional
+libraries discussed above into the pdfbox/target/dependencies directory. You 
can then simply copy all the libraries you need from this directory to your 
application.
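
The JCE requirement described in building.md can be checked before starting a build. The following is a minimal sketch, not the check that `TestPublicKeyEncryption` itself performs: it assumes that an AES key length limit above 128 bits indicates the unlimited strength policy files are in place. The class name `JcePolicyCheck` and the helper `isUnlimited` are hypothetical and exist only for this illustration; `Cipher.getMaxAllowedKeyLength` is the standard JDK call.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

// Hypothetical helper, not part of PDFBox.
final class JcePolicyCheck
{
    // Assumption: the restricted default policy caps AES keys at 128 bits.
    private static final int RESTRICTED_AES_KEY_LENGTH = 128;

    private JcePolicyCheck()
    {
    }

    // Returns true if the given limit exceeds the restricted default.
    static boolean isUnlimited(int maxKeyLength)
    {
        return maxKeyLength > RESTRICTED_AES_KEY_LENGTH;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException
    {
        // Ask the JDK for the largest AES key length the installed policy allows.
        int maxKeyLength = Cipher.getMaxAllowedKeyLength("AES");
        if (isUnlimited(maxKeyLength))
        {
            System.out.println("Unlimited strength policy appears to be installed");
        }
        else
        {
            System.out.println("Restricted policy, max AES key length: " + maxKeyLength);
        }
    }
}
```

If this reports a restricted policy, the PDFBox 2.0 build is likely to fail in the test named in building.md until the policy files are installed.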

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/building.mdtext
----------------------------------------------------------------------
diff --git a/content/building.mdtext b/content/building.mdtext
deleted file mode 100644
index bf1e4ef..0000000
--- a/content/building.mdtext
+++ /dev/null
@@ -1,70 +0,0 @@
----
-layout: default
-title:  Building PDFBox
----
-
-# Building from Source
-
-Building PDFBox from source is only necessary if you want to contribute 
code to the PDFBox project. Most users should use the [binary 
releases](http://pdfbox.apache.org/download.cgi) instead.
-
-## Obtaining the Source
-
-You can obtain the latest source of PDFBox from our [SVN 
repo](http://pdfbox.apache.org/download.cgi). The current trunk is 
v2.0.0-SNAPSHOT. There is a separate branch for the 1.8.x series. You can fetch 
the latest 2.0 trunk using Subversion:
-
-    svn checkout http://svn.apache.org/repos/asf/pdfbox/trunk/
-    cd trunk
-
-## Build dependencies
-
-### PDFBox 1.8
-
-- JDK 5 or 6
--  [Maven 2](http://maven.apache.org/)
-
-### PDFBox 2.0
-
-- JDK 6+
-- Java Cryptography Extension (JCE) [see below]
--  [Maven 2](http://maven.apache.org/)
-
-### Java Cryptography Extension (JCE)
-
-Building PDFBox 2.0 requires a JDK with "unlimited strength" cryptography, 
which requires extra files to be installed. For JDK 7, see [Java Cryptography 
Extension 
(JCE)](http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html).
 If these files are not installed, building PDFBox will fail the following test:
-
-    TestPublicKeyEncryption.setUp:70 JCE unlimited strength jurisdiction 
policy files are not installed
-    
-## Building with Maven
-
-In the root directory of PDFBox:
-
-    mvn clean install
-
----
-
-## Building with Ant (Deprecated, removed in 2.0.0)
-
-The old Ant build is still available, and can be used especially for
-building .NET binaries with IKVM:
-
-1.  Install [ANT](http://ant.apache.org/). PDFBox currently uses 1.6.2
-    but other versions probably work as well.
-2.  (optional) Setup IKVM, if you want to build the .NET DLL version of
-    PDFBox.
-    1.  [IKVM](http://www.ikvm.net/) binaries
-    2.  In the build.properties, set the ikvm.dir property:\
-         `ikvm.dir=C:\\javalib\\ikvm-12-07-2004\\ikvm`
-
-3.  Run "`ant`" from the root PDFBox directory. This will create the
-    .zip package distribution. See the build file for other ant targets.
-
-NOTE: If you want to run PDFBox from an IDE then you will need to add
-the 'Resources' directory to the project classpath in your IDE.
-
-### Dependencies for Ant Builds
-
-The above instructions expect that you're using 
[Maven](http://maven.apache.org/) or another build tool like 
[Ivy](http://ant.apache.org/ivy/) that supports Maven dependencies.
-If you instead use tools like [Ant](http://ant.apache.org/) where you need to 
explicitly include all the required library jars in your application, you'll 
need to do
-something different.
-
-The easiest approach is to run ``mvn dependency:copy-dependencies`` inside the 
pdfbox directory of the latest PDFBox source release. This will copy all the 
required and optional
-libraries discussed above into the pdfbox/target/dependencies directory. You 
can then simply copy all the libraries you need from this directory to your 
application.

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/codingconventions.md
----------------------------------------------------------------------
diff --git a/content/codingconventions.md b/content/codingconventions.md
new file mode 100644
index 0000000..280b571
--- /dev/null
+++ b/content/codingconventions.md
@@ -0,0 +1,128 @@
+---
+layout: default
+title:  Coding Conventions
+---
+
+# Coding Conventions
+
+Over the years the PDFBox project has come to adopt a number of coding 
conventions. These are not always followed in old code but new code should try 
to follow these rules where possible.
+
+### Formatting
+
+- Braces go on their own line.
+
+- Always use braces with control flow statements.
+
+- No lines longer than 100 characters, including JavaDoc.
+
+- Wrapped lines should use either an indent of 4 or 8 characters or align with 
the expression at the same level on the previous line.
+
+- Wrapped lines should be broken after operators, not before.
+
+- Prefer aligned wrapped lines.
+
+- Prefer aligned wrapped parameter lists.
+
+### Whitespace
+
+- Four spaces for indents, no tabs.
+
+- Do not use spaces around parentheses.
+
+- Use spaces after control flow keywords.
+
+- Prefer using blank lines to separate logical blocks of code, but do not be 
excessive.
+
+- Prefer not following casts with a blank space.
+
+### Structure
+
+- Do not use package imports (e.g. `import java.util.*`)
+
+- Static fields and methods must appear at the top of a class, before any 
other code.
+
+- Within a class, definitions should be ordered as follows:
+
+    Class (static) variables  
+    Instance variables  
+    Constructors  
+    Methods  
+
+### JavaDoc
+
+- Public and protected methods and fields must have JavaDoc.
+
+- Don't use `@version` tags.
+
+- Don't use `@since` tags.
+
+- Don't include your e-mail address in `@author` tags.
+
+- You may omit `@return` tags for getters as long as you include a summary 
which begins with the word "Returns".
+
+- Private methods do not require JavaDoc but may have partial JavaDoc if it 
adds valuable information.
+
+### Comments
+
+- Only use line comments within code, never block comments.
+
+- Prefer comments on their own line, rather than trailing, unless the latter 
is more readable.
+
+- Prefix line comments by a space `// like this`.
+
+### Variables
+
+- Prefer initializing variables when they are declared, rather than C-style 
declaration before use.
+
+- Always use final fields when possible.
+
+### Control Flow
+
+- Prefer multiple return statements over additional control flow logic.
+
+- Prefer switch statements over multi-clause if-then statements.
+
+### API Design
+
+- Give variables and methods meaningful names. Keep these short but don't use 
abbreviations. Prefer using the same terminology as the PDF spec.
+
+- Prefer final classes and final protected methods for non-final public 
classes; this reduces the surface area of the public API.
+
+- Avoid non-final protected variables in public classes. Prefer protected 
getters over protected variables when protected fields are necessary in public 
classes.
+
+- Minimize the API. Don't make everything public just because you can.
+
+- Don't expose implementation details unless there is a clear need: allowing 
subclassing means that the behaviour of protected methods becomes part of the 
contract of the public API.
+
+- Avoid unnecessary abstraction. While you're encouraged to avoid brittle 
designs, it's unlikely that an API designed for "future use" will have the 
correct API without any code which actually uses it.
+ 
+### Example
+
+Here's an example of PDFBox's formatting style:
+
+    public class Foo extends Bar
+    {
+        public static void main(String args[])
+        {
+            try
+            {
+                for (int i = 0; i < args.length; i++)
+                {
+                    System.out.println(Integer.parseInt(args[i]));
+                }
+            }
+            catch (NumberFormatException e)
+            {
+                e.printStackTrace();
+            }
+        }
+    }
+
+## Eclipse Formatter
+
+Eclipse users may download this preferences file: pdfbox-eclipse-formatter.xml 
and import this into Eclipse. 
+(Window->Preferences, go to Java->Code Style->Formatter and click "Import...").
+Once you have done this you can reformat your code by using Source->Format 
(Ctrl+Shift+F).
+
+Also note that Eclipse will automatically format your import statements 
appropriately when 
+you invoke Source -> Organize Imports (Ctrl+Shift+O).
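
The formatting example in codingconventions.md covers braces and indentation only. As a complement, here is a small sketch that applies several of the other listed rules together: static members first, final fields, JavaDoc summaries beginning with "Returns", line comments only, a switch instead of a multi-clause if-then chain, and multiple return statements. The class `RotationLabel` and its strings are invented for this illustration and are not part of PDFBox.

```java
/**
 * Hypothetical example class illustrating the conventions; not part of PDFBox.
 */
final class RotationLabel
{
    private static final String UNKNOWN = "unknown";

    private final int rotation;

    /**
     * Creates a label for a page rotation.
     *
     * @param rotation the rotation in degrees
     */
    RotationLabel(int rotation)
    {
        this.rotation = rotation;
    }

    /**
     * Returns a short description of the rotation.
     */
    String describe()
    {
        // a switch with early returns instead of chained if-then clauses
        switch (rotation)
        {
            case 0:
                return "upright";
            case 90:
                return "rotated right";
            case 180:
                return "upside down";
            case 270:
                return "rotated left";
            default:
                return UNKNOWN;
        }
    }
}
```

The class is final and exposes no protected members, which follows the API design advice to keep the public surface small.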

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/codingconventions.mdtext
----------------------------------------------------------------------
diff --git a/content/codingconventions.mdtext b/content/codingconventions.mdtext
deleted file mode 100644
index 280b571..0000000
--- a/content/codingconventions.mdtext
+++ /dev/null
@@ -1,128 +0,0 @@
----
-layout: default
-title:  Coding Conventions
----
-
-# Coding Conventions
-
-Over the years the PDFBox project has come to adopt a number of coding 
conventions. These are not always followed in old code but new code should try 
to follow these rules where possible.
-
-### Formatting
-
-- Braces go on their own line.
-
-- Always use braces with control flow statements.
-
-- No lines longer than 100 characters, including JavaDoc.
-
-- Wrapped lines should use either an indent of 4 or 8 characters or align with 
the expression at the same level on the previous line.
-
-- Wrapped lines should be broken after operators, not before.
-
-- Prefer aligned wrapped lines.
-
-- Prefer aligned wrapped parameter lists.
-
-### Whitespace
-
-- Four spaces for indents, no tabs.
-
-- Do not use spaces around parentheses.
-
-- Use spaces after control flow keywords.
-
-- Prefer using blank lines to separate logical blocks of code, but do not be 
excessive.
-
-- Prefer not following casts with a blank space.
-
-### Structure
-
-- Do not use package imports (e.g. `import java.util.*`)
-
-- Static fields and methods must appear at the top of a class, before any 
other code.
-
-- Within a class, definitions should be ordered as follows:
-
-    Class (static) variables  
-    Instance variables  
-    Constructors  
-    Methods  
-
-### JavaDoc
-
-- Public and protected methods and fields must have JavaDoc.
-
-- Don't use `@version` tags.
-
-- Don't use `@since` tags.
-
-- Don't include your e-mail address in `@author` tags.
-
-- You may omit `@return` tags for getters as long as you include a summary 
which begins with the word "Returns".
-
-- Private methods do not require JavaDoc but may have partial JavaDoc if it 
adds valuable information.
-
-### Comments
-
-- Only use line comments within code, never block comments.
-
-- Prefer comments on their own line, rather than trailing, unless the latter 
is more readable.
-
-- Prefix line comments by a space `// like this`.
-
-### Variables
-
-- Prefer initializing variables when they are declared, rather than C-style 
declaration before use.
-
-- Always use final fields when possible.
-
-### Control Flow
-
-- Prefer multiple return statements over additional control flow logic.
-
-- Prefer switch statements over multi-clause if-then statements.
-
-### API Design
-
-- Give variables and methods meaningful names. Keep these short but don't use 
abbreviations. Prefer using the same terminology as the PDF spec.
-
-- Prefer final classes and final protected methods for non-final public 
classes; this reduces the surface area of the public API.
-
-- Avoid non-final protected variables in public classes. Prefer protected 
getters over protected variables when protected fields are necessary in public 
classes.
-
-- Minimize the API. Don't make everything public just because you can.
-
-- Don't expose implementation details unless there is a clear need: allowing 
subclassing means that the behaviour of protected methods becomes part of the 
contract of the public API.
-
-- Avoid unnecessary abstraction. While you're encouraged to avoid brittle 
designs, it's unlikely that an API designed for "future use" will have the 
correct API without any code which actually uses it.
- 
-### Example
-
-Here's an example of PDFBox's formatting style:
-
-    public class Foo extends Bar
-    {
-        public static void main(String args[])
-        {
-            try
-            {
-                for (int i = 0; i < args.length; i++)
-                {
-                    System.out.println(Integer.parseInt(args[i]));
-                }
-            }
-            catch (NumberFormatException e)
-            {
-                e.printStackTrace();
-            }
-        }
-    }
-
-## Eclipse Formatter
-
-Eclipse users may download this preferences file: pdfbox-eclipse-formatter.xml 
and import this into Eclipse. 
-(Window->Preferences, go to Java->Code Style->Formatter and click "Import...").
-Once you have done this you can reformat your code by using Source->Format 
(Ctrl+Shift+F).
-
-Also note that Eclipse will automatically format your import statements 
appropriately when 
-you invoke Source -> Organize Imports (Ctrl+Shift+O).

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/errors/403.md
----------------------------------------------------------------------
diff --git a/content/errors/403.md b/content/errors/403.md
new file mode 100644
index 0000000..888f7b8
--- /dev/null
+++ b/content/errors/403.md
@@ -0,0 +1,15 @@
+---
+layout: default
+title:  Forbidden (403)
+---
+# 403
+
+We're sorry, but the page you requested cannot be accessed. 
+
+Maybe you 
+
+* typed the address incorrectly
+* followed a link from another site that pointed to this page.
+
+
+If you came by following a broken link, please report the 
[issue](https://issues.apache.org/jira/browse/pdfbox).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/errors/403.mdtext
----------------------------------------------------------------------
diff --git a/content/errors/403.mdtext b/content/errors/403.mdtext
deleted file mode 100644
index 888f7b8..0000000
--- a/content/errors/403.mdtext
+++ /dev/null
@@ -1,15 +0,0 @@
----
-layout: default
-title:  Forbidden (403)
----
-# 403
-
-We're sorry, but the page you requested cannot be accessed. 
-
-Maybe you 
-
-* typed the address incorrectly
-* followed a link from another site that pointed to this page.
-
-
-If you came by following a broken link, please report the 
[issue](https://issues.apache.org/jira/browse/pdfbox).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/errors/404.md
----------------------------------------------------------------------
diff --git a/content/errors/404.md b/content/errors/404.md
new file mode 100644
index 0000000..e83602d
--- /dev/null
+++ b/content/errors/404.md
@@ -0,0 +1,15 @@
+---
+layout: default
+title:  Page Not Found
+---
+# 404
+
+We're sorry, but the page you requested cannot be found. 
+
+Maybe you 
+
+* typed the address incorrectly
+* followed a link from another site that pointed to this page.
+
+
+If you came by following a broken link, please report the 
[issue](https://issues.apache.org/jira/browse/pdfbox).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/errors/404.mdtext
----------------------------------------------------------------------
diff --git a/content/errors/404.mdtext b/content/errors/404.mdtext
deleted file mode 100644
index e83602d..0000000
--- a/content/errors/404.mdtext
+++ /dev/null
@@ -1,15 +0,0 @@
----
-layout: default
-title:  Page Not Found
----
-# 404
-
-We're sorry, but the page you requested cannot be found. 
-
-Maybe you 
-
-* typed the address incorrectly
-* followed a link from another site that pointed to this page.
-
-
-If you came by following a broken link, please report the 
[issue](https://issues.apache.org/jira/browse/pdfbox).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/ideas.md
----------------------------------------------------------------------
diff --git a/content/ideas.md b/content/ideas.md
new file mode 100644
index 0000000..3090f23
--- /dev/null
+++ b/content/ideas.md
@@ -0,0 +1,88 @@
+---
+layout: default
+title:  Ideas
+---
+
+# Ideas
+
+There are several ideas to enhance PDFBox. These are outlined below together 
with 
+comments and the releases they are planned for as soon as there is agreement 
to do the
+implementation.
+
+## Enhance type safety
+
+Enhance the type safety of PDFBox and add more generic collections and code 
cleanup.
+
+## Remove all deprecated methods
+
+This is an ongoing effort and most/all deprecated methods will be removed in 
PDFBox 2.0.0
+
+## Handle large PDF files
+
+In addition to the PDF parsing pdfbox does not always handle large PDF files 
well as some 
+of the references are implemented as int instead of long
+
+
+## <span class="complete">Switch to Java 1.6</span>
+
+<span class="complete">PDFBox 2.0.0 has Java 6 as minimum requirement.</span>
+
+## <span class="complete">Break PDFBox into modules</span>
+
+<span class="complete">In order to support different use cases and provide a 
minimal toolset PDFBox 2.0.0 should be 
+separated into different modules. This goes inline with rearranging some of 
the code
+e.g. remove AWT from PDDocument.
+</span>
+
+## <span class="complete">Enhance the font rendering</span>
+
+<span class="complete">PDFBox 2.0.0 will render most of the fonts without 
using AWT.</span>
+ 
+## Replace/enhance PDF parsing
+
+<span class="complete">The old "classic" PDF parser in PDFBox is not in line 
with the PDF specification as it parses
+a PDF from top to bottom instead of respecting the XRef information.</span> 
The NonSequentialParser
+enhanced that situation but there is a need to have a cleaner foundation 
broken into several levels
+
+- io
+- tokenization
+- parsing according to structure
+- COS level document
+- PD level document
+- add some self healing mechanism to process corrupt files
+
+In addition handling documents which are not conforming shouldn't be part of 
the core parser
+but of an extensible approach e.g. by adding hooks to allow for handling 
parsing exceptions.
+
+## <span class="complete">Add the ability to create PDFs using unicode encoded 
text</span>
+
+<span class="complete">The recent PDFBox version is limited to WinANSI encoded 
text. 2.0.0 should have unicode support as well.</span>
+
+## Rearchitect the COS level objects
+
+The COS level objects need to be refactored to be in line with the new parser. 
In addition
+method signatures, constructing ... should be made similar across the COS 
objects
+
+## Parsing on demand
+
+Instead of always parsing the complete document PDFs should be parsable on 
demand making
+objects only available as they are needed to enhance performance and minimize 
memory footprint.
+
+This might be achieved by providing a layered approach where a base (non 
caching) parser provides
+the on demand parsing and a caching parser built on top caches objects for use 
cases where
+this is beneficial e.g. rendering, debugging ...
+
+- the lexer would be the low level component delivering tokens to the parser.
+  A sample implementation exists as part of PDFBOX-1000. The benefit would be 
a clean low
+  level handling of tokens. The current implementation needs to be (slightly 
?) revised though
+- the incremental (non caching) parser would allow for page by page processing 
moving forward 
+  only to support text extraction, merging, splitting … - the benefit would 
be a lower memory 
+  consumption as well as potentially faster processing
+- the caching parser would support applications such as PDFDebugger or 
PDFReader 
+
+## Handling of PDF versions
+The current implementation is a mix of PDF 1.4 and some ad hoc additions 
without a clear 
+distinction what is and is not supported. We could add some support for 
explicitly handling
+versions in PDFBox e.g. by marking certain methods and properties with the PDF 
version support
+level. This could in addition be a good basis for PDF/A and other compliance 
checks. 
+

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/ideas.mdtext
----------------------------------------------------------------------
diff --git a/content/ideas.mdtext b/content/ideas.mdtext
deleted file mode 100644
index 3090f23..0000000
--- a/content/ideas.mdtext
+++ /dev/null
@@ -1,88 +0,0 @@
----
-layout: default
-title:  Ideas
----
-
-# Ideas
-
-There are several ideas to enhance PDFBox. These are outlined below together 
with 
-comments and the releases they are planned for as soon as there is agreement 
to do the
-implementation.
-
-## Enhance type safety
-
-Enhance the type safety of PDFBox and add more generic collections and code 
cleanup.
-
-## Remove all deprecated methods
-
-This is an ongoing effort and most/all deprecated methods will be removed in 
PDFBox 2.0.0
-
-## Handle large PDF files
-
-In addition to the PDF parsing pdfbox does not always handle large PDF files 
well as some 
-of the references are implemented as int instead of long
-
-
-## <span class="complete">Switch to Java 1.6</span>
-
-<span class="complete">PDFBox 2.0.0 has Java 6 as minimum requirement.</span>
-
-## <span class="complete">Break PDFBox into modules</span>
-
-<span class="complete">In order to support different use cases and provide a 
minimal toolset PDFBox 2.0.0 should be 
-separated into different modules. This goes inline with rearranging some of 
the code
-e.g. remove AWT from PDDocument.
-</span>
-
-## <span class="complete">Enhance the font rendering</span>
-
-<span class="complete">PDFBox 2.0.0 will render most of the fonts without 
using AWT.</span>
- 
-## Replace/enhance PDF parsing
-
-<span class="complete">The old "classic" PDF parser in PDFBox is not in line 
with the PDF specification as it parses
-a PDF from top to bottom instead of respecting the XRef information.</span> 
The NonSequentialParser
-enhanced that situation but there is a need to have a cleaner foundation 
broken into several levels
-
-- io
-- tokenization
-- parsing according to structure
-- COS level document
-- PD level document
-- add some self healing mechanism to process corrupt files
-
-In addition handling documents which are not conforming shouldn't be part of 
the core parser
-but of an extensible approach e.g. by adding hooks to allow for handling 
parsing exceptions.
-
-## <span class="complete">Add the ability to create PDFs using unicode encoded 
text</span>
-
-<span class="complete">The recent PDFBox version is limited to WinANSI encoded 
text. 2.0.0 should have unicode support as well.</span>
-
-## Rearchitect the COS level objects
-
-The COS level objects need to be refactored to be in line with the new parser. 
In addition
-method signatures, constructing ... should be made similar across the COS 
objects
-
-## Parsing on demand
-
-Instead of always parsing the complete document PDFs should be parsable on 
demand making
-objects only available as they are needed to enhance performance and minimize 
memory footprint.
-
-This might be achieved by providing a layered approach where a base (non 
caching) parser provides
-the on demand parsing and a caching parser built on top caches objects for use 
cases where
-this is beneficial e.g. rendering, debugging ...
-
-- the lexer would be the low level component delivering tokens to the parser.
-  A sample implementation exists as part of PDFBOX-1000. The benefit would be 
a clean low
-  level handling of tokens. The current implementation needs to be (slightly 
?) revised though
-- the incremental (non caching) parser would allow for page by page processing 
moving forward 
-  only to support text extraction, merging, splitting … - the benefit would 
be a lower memory 
-  consumption as well as potentially faster processing
-- the caching parser would support applications such as PDFDebugger or 
PDFReader 
-
-## Handling of PDF versions
-The current implementation is a mix of PDF 1.4 and some ad hoc additions 
without a clear 
-distinction what is and is not supported. We could add some support for 
explicitly handling
-versions in PDFBox e.g. by marking certain methods and properties with the PDF 
version support
-level. This could in addition be a good basis for PDF/A and other compliance 
checks. 
-

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/index.md
----------------------------------------------------------------------
diff --git a/content/index.md b/content/index.md
new file mode 100644
index 0000000..d3eec51
--- /dev/null
+++ b/content/index.md
@@ -0,0 +1,65 @@
+---
+layout: default
+title:  A Java PDF Library
+---
+# Apache PDFBox - A Java PDF Library
+
+<p class="lead">The Apache PDFBox™ library is an open source Java tool for 
working with
+    PDF documents. This project allows creation of new PDF documents, 
manipulation of existing
+    documents and the ability to extract content from documents.
+
+    Apache PDFBox also includes several command line utilities.
+    Apache PDFBox is published under the Apache License v2.0.</p>
+    
+## News
+With the initial discussions starting 3 years ago PDFBox 2.0.0 is in the works 
for quite some time now - **and we are in the final stages!** To give you the 
opportunity to provide feedback a [PDFBox 2.0.0-RC1 Release 
Candidate](http://pdfbox.apache.org/download.cgi) is now available. The 
[Migration Guide](http://pdfbox.apache.org/2.0/migration.html) shall give users 
coming from PDFBox 1.8 or earlier an overview about things to look at when 
switching over. More details to come.
+
+## Getting Help ##
+
+To get help on using PDFBox, please [Subscribe to the Users Mailing 
List](mailto:users-subscr...@pdfbox.apache.org) and post your
+questions there. We're happy to help.
+
+The project is a volunteer effort and we're always looking for interested 
people to help
+us improve PDFBox. There are a multitude of ways that you can help us 
depending on your
+skills. Subscribe to the [Mailing Lists](/mailinglists.html) and find out how 
you can help.
+
+<h2 id="features">Features</h2>
+
+<div class="row">
+    <div class="col-md-3">
+        <header><h4><span class="oi oi-box"></span>Extract Text</h4></header>
+        <p>Extract Unicode text from PDF files.</p>
+    </div>
+    <div class="col-md-3">
+        <header><h4><span class="oi oi-box"></span>Split &amp; 
Merge</h4></header>
+        <p>Split a single PDF into many files or merge multiple PDF files.</p>
+    </div>
+    <div class="col-md-3">
+        <header><h4><span class="oi oi-box"></span>Fill Forms</h4></header>
+        <p>Extract data from PDF forms or fill a PDF form.</p>
+    </div>
+    <div class="col-md-3">
+        <header><h4><span class="oi oi-box"></span>Preflight</h4></header>
+        <p>Validate PDF files against the PDF/A-1b standard.</p>
+    </div>
+</div>
+
+<div class="row">
+    <div class="col-md-3">
+        <header><h4><span class="oi oi-box"></span>Print</h4></header>
+        <p>Print a PDF file using the standard Java printing API.</p>
+    </div>
+    <div class="col-md-3">
+        <header><h4><span class="oi oi-box"></span>Save as Image</h4></header>
+        <p>Save PDFs as image files, such as PNG or JPEG.</p>
+    </div>
+    <div class="col-md-3">
+        <header><h4><span class="oi oi-box"></span>Create PDFs</h4></header>
+        <p>Create a PDF from scratch, with embedded fonts and images.</p>
+    </div>
+    <div class="col-md-3">
+        <header><h4><span class="oi oi-box"></span>Signing</h4></header>
+        <p>Digitally sign PDF files.</p>
+    </div>
+</div>
+

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/index.mdtext
----------------------------------------------------------------------
diff --git a/content/index.mdtext b/content/index.mdtext
deleted file mode 100644
index d3eec51..0000000
--- a/content/index.mdtext
+++ /dev/null
@@ -1,65 +0,0 @@
----
-layout: default
-title:  A Java PDF Library
----
-# Apache PDFBox - A Java PDF Library
-
-<p class="lead">The Apache PDFBox™ library is an open source Java tool for 
working with
-    PDF documents. This project allows creation of new PDF documents, 
manipulation of existing
-    documents and the ability to extract content from documents.
-
-    Apache PDFBox also includes several command line utilities.
-    Apache PDFBox is published under the Apache License v2.0.</p>
-    
-## News
-With the initial discussions starting 3 years ago PDFBox 2.0.0 is in the works 
for quite some time now - **and we are in the final stages!** To give you the 
opportunity to provide feedback a [PDFBox 2.0.0-RC1 Release 
Candidate](http://pdfbox.apache.org/download.cgi) is now available. The 
[Migration Guide](http://pdfbox.apache.org/2.0/migration.html) shall give users 
coming from PDFBox 1.8 or earlier an overview about things to look at when 
switching over. More details to come.
-
-## Getting Help ##
-
-To get help on using PDFBox, please [Subscribe to the Users Mailing 
List](mailto:users-subscr...@pdfbox.apache.org) and post your
-questions there. We're happy to help.
-
-The project is a volunteer effort and we're always looking for interested 
people to help
-us improve PDFBox. There are a multitude of ways that you can help us 
depending on your
-skills. Subscribe to the [Mailing Lists](/mailinglists.html) and find out how 
you can help.
-
-<h2 id="features">Features</h2>
-
-<div class="row">
-    <div class="col-md-3">
-        <header><h4><span class="oi oi-box"></span>Extract Text</h4></header>
-        <p>Extract Unicode text from PDF files.</p>
-    </div>
-    <div class="col-md-3">
-        <header><h4><span class="oi oi-box"></span>Split &amp; 
Merge</h4></header>
-        <p>Split a single PDF into many files or merge multiple PDF files.</p>
-    </div>
-    <div class="col-md-3">
-        <header><h4><span class="oi oi-box"></span>Fill Forms</h4></header>
-        <p>Extract data from PDF forms or fill a PDF form.</p>
-    </div>
-    <div class="col-md-3">
-        <header><h4><span class="oi oi-box"></span>Preflight</h4></header>
-        <p>Validate PDF files against the PDF/A-1b standard.</p>
-    </div>
-</div>
-
-<div class="row">
-    <div class="col-md-3">
-        <header><h4><span class="oi oi-box"></span>Print</h4></header>
-        <p>Print a PDF file using the standard Java printing API.</p>
-    </div>
-    <div class="col-md-3">
-        <header><h4><span class="oi oi-box"></span>Save as Image</h4></header>
-        <p>Save PDFs as image files, such as PNG or JPEG.</p>
-    </div>
-    <div class="col-md-3">
-        <header><h4><span class="oi oi-box"></span>Create PDFs</h4></header>
-        <p>Create a PDF from scratch, with embedded fonts and images.</p>
-    </div>
-    <div class="col-md-3">
-        <header><h4><span class="oi oi-box"></span>Signing</h4></header>
-        <p>Digitally sign PDF files.</p>
-    </div>
-</div>
-

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/mailinglists.md
----------------------------------------------------------------------
diff --git a/content/mailinglists.md b/content/mailinglists.md
new file mode 100644
index 0000000..4cbb80b
--- /dev/null
+++ b/content/mailinglists.md
@@ -0,0 +1,28 @@
+---
+layout: default
+title:  Mailing Lists
+---
+
+# Mailing Lists
+
+Mailing Lists are the primary communication channels for all projects at 
+The Apache Software Foundation. Therefore, this applies to Apache PDFBox, too. 
+
+**Please read the [public forum archive 
policy](http://www.apache.org/foundation/public-archives.html) carefully before 
subscribing to one of our lists.**
+
+If you have any questions about or problems with Apache PDFBox, you can get 
them addressed 
+on the **Users Mailing List**. 
+
+If you would like to participate in the development of Apache PDFBox, 
+the **Developers Mailing List** is the place to be. 
+
+If you would like to keep track of what's being changed inside the project, you can 
subscribe 
+to the **Commit Mailing List**.
+
+<p class="alert alert-info">Please use the Users Mailing List if you are 
unsure which list to use</p>
+
+| Name | Address | Subscribe | Unsubscribe | Help | Archive | MarkMail |
+| --- | --- | --- | ---| ---| --- | --- |
+| Users | us...@pdfbox.apache.org | 
[Subscribe](mailto:users-subscr...@pdfbox.apache.org) | 
[Unsubscribe](mailto:users-unsubscr...@pdfbox.apache.org) | 
[Help](mailto:users-h...@pdfbox.apache.org) | 
[Archive](http://mail-archives.apache.org/mod_mbox/pdfbox-users/) | 
[MarkMail](http://pdfbox-users.markmail.org/) |
+| Developers | d...@pdfbox.apache.org | 
[Subscribe](mailto:dev-subscr...@pdfbox.apache.org) | 
[Unsubscribe](mailto:dev-unsubscr...@pdfbox.apache.org) | 
[Help](mailto:dev-h...@pdfbox.apache.org) | 
[Archive](http://mail-archives.apache.org/mod_mbox/pdfbox-dev/) | 
[MarkMail](http://pdfbox-dev.markmail.org/) |     
+| Commits List | commits@pdfbox.apache.org | 
[Subscribe](mailto:commits-subscr...@pdfbox.apache.org) | 
[Unsubscribe](mailto:commits-unsubscr...@pdfbox.apache.org) | 
[Help](mailto:commits-h...@pdfbox.apache.org) | 
[Archive](http://mail-archives.apache.org/mod_mbox/pdfbox-commits/) | 
[MarkMail](http://pdfbox-commits.markmail.org/) |    

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/mailinglists.mdtext
----------------------------------------------------------------------
diff --git a/content/mailinglists.mdtext b/content/mailinglists.mdtext
deleted file mode 100644
index 4cbb80b..0000000
--- a/content/mailinglists.mdtext
+++ /dev/null
@@ -1,28 +0,0 @@
----
-layout: default
-title:  Mailing Lists
----
-
-# Mailing Lists
-
-Mailing Lists are the primary communication channels for all projects at 
-The Apache Software Foundation. Therefore, this applies to Apache PDFBox, too. 
-
-**Please read the [public forum archive 
policy](http://www.apache.org/foundation/public-archives.html) carefully before 
subscribing to one of our lists.**
-
-If you have any questions about or problems with Apache PDFBox, you can get 
them addressed 
-on the **Users Mailing List**. 
-
-If you would like to participate in the development of Apache PDFBox, 
-the **Developers Mailing List** is the place to be. 
-
-If you would like to keep track of what's being changed inside the project, you can 
subscribe 
-to the **Commit Mailing List**.
-
-<p class="alert alert-info">Please use the Users Mailing List if you are 
unsure which list to use</p>
-
-| Name | Address | Subscribe | Unsubscribe | Help | Archive | MarkMail |
-| --- | --- | --- | ---| ---| --- | --- |
-| Users | us...@pdfbox.apache.org | 
[Subscribe](mailto:users-subscr...@pdfbox.apache.org) | 
[Unsubscribe](mailto:users-unsubscr...@pdfbox.apache.org) | 
[Help](mailto:users-h...@pdfbox.apache.org) | 
[Archive](http://mail-archives.apache.org/mod_mbox/pdfbox-users/) | 
[MarkMail](http://pdfbox-users.markmail.org/) |
-| Developers | d...@pdfbox.apache.org | 
[Subscribe](mailto:dev-subscr...@pdfbox.apache.org) | 
[Unsubscribe](mailto:dev-unsubscr...@pdfbox.apache.org) | 
[Help](mailto:dev-h...@pdfbox.apache.org) | 
[Archive](http://mail-archives.apache.org/mod_mbox/pdfbox-dev/) | 
[MarkMail](http://pdfbox-dev.markmail.org/) |     
-| Commits List | commits@pdfbox.apache.org | 
[Subscribe](mailto:commits-subscr...@pdfbox.apache.org) | 
[Unsubscribe](mailto:commits-unsubscr...@pdfbox.apache.org) | 
[Help](mailto:commits-h...@pdfbox.apache.org) | 
[Archive](http://mail-archives.apache.org/mod_mbox/pdfbox-commits/) | 
[MarkMail](http://pdfbox-commits.markmail.org/) |    

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/references.md
----------------------------------------------------------------------
diff --git a/content/references.md b/content/references.md
new file mode 100644
index 0000000..1805246
--- /dev/null
+++ b/content/references.md
@@ -0,0 +1,48 @@
+---
+layout: default
+title:  External Links
+---
+
+# External Links
+
+This page lists projects that utilize PDFBox and articles that have been 
written about PDFBox. 
+Please file an [improvement 
issue](https://issues.apache.org/jira/browse/PDFBOX) to get new projects or 
articles added to this page, or to update the information on existing links.
+
+## Projects
+
+| Project Name | License | Project Description |
+| --- | --- | --- |
+| [Alfresco](http://www.alfresco.org/) | LGPL - commercial 
services/support/training is available | Alfresco is an open source, 
open-standards content repository built by the most experienced content 
management team that includes the co-founder of Documentum.|
+| [Apache Nutch](http://nutch.apache.org/) | Apache License V2.0 | Apache 
Nutch is open source web-search software. It builds on Apache Lucene, adding 
web-specifics, such as a crawler, a link-graph database, parsers for HTML and 
other document formats, etc.|
+| [Apache Tika](http://tika.apache.org/) | Apache License V2.0 | Apache Tika 
is a toolkit for detecting and extracting metadata and structured text content 
from various documents using existing parser libraries.|
+| [Centric CRM](http://www.centriccrm.com/) | Free To Use But 
Restricted/Commercial | The Most Advanced Open Source CRM Software.|
+| [Canoo Webtest](http://webtest.canoo.com/webtest/manual/WebTestHome.html) | 
BSD Like | Free OpenSource tool for XP-style acceptance testing of Java-based 
Web applications.|
+| [contineo](http://webtest.canoo.com/webtest/manual/WebTestHome.html) | GPL | 
Contineo is a web based document management system.|
+| [ECM REWOO Scope](http://www.rewoo.de/) | Commercial | REWOO Scope is an 
Enterprise Content Management (ECM) software to organize, structure and 
consolidate enterprise data. Apache PDFBox is an integral part to read and 
index PDF documents.|
+| [Jahia](http://www.jahia.org/) | collaborative source license | The Jahia 
product is currently the most powerful, ready-to-use and affordable integrated 
midrange Java Content Management and Corporate Portal Server.|
+| [jLibrary](http://jlibrary.sourceforge.net/) | BSD | jLibrary is a Document 
Management System, oriented for personal and enterprise use.|
+| [Jomic](http://jomic.sourceforge.net/) | GPL | Jomic is a viewer for comic 
book archives.|
+| [JpdfUnit](http://jpdfunit.sourceforge.net/) | Apache License V2.0 | pdfUnit 
is a framework for testing a generated pdf document with the JUnit Test 
Framework.|
+| [Liferay Portal](http://www.liferay.com/) | MIT | Liferay Portal is an open 
source portal that helps organizations collaborate more efficiently by 
providing a consolidated view of disparate applications.|
+| [LIUS](http://www.bibl.ulaval.ca/lius/index.en.html) | GPL | LIUS is an 
indexing Java framework based on the Jakarta Lucene project. The LIUS framework 
adds to Lucene many file format indexing functionalities, such as Ms Word, Ms 
Excel, Ms PowerPoint, RTF, PDF, XML, HTML, TXT, Open Office suite and 
JavaBeans.|
+| [LuceGene](http://gmod.org/wiki/LuceGene) | Artistic License | LuceGene is 
an open-source document/object search and retrieval system specially tuned for 
bioinformatics text databases and documents.|
+| [Lutece](http://www.lutece.paris.fr/) | BSD-like | Lutece is a portal engine 
which allows you to easily create your websites or intranets based upon 
HTML,XML content.|
+| [MMBase Lucene Module](http://mmapps.sourceforge.net/lucenemodule/) | MPL | 
Lucenemodule is a plugin (module) for the MMBase content management system that 
enables Lucene full text search through its content, and thanks to PDFBox also 
PDF content.|
+| [OpenCms](http://www.opencms.org/) | Custom | OpenCms is a professional 
level Open Source Website Content Management System.|
+| [OpenSearchServer](http://www.open-search-server.com/) | GPLv3 | An open 
source search engine and crawler based on best open source technologies. It is 
a modern search engine and a suite of high-powered full text search algorithms.|
+| [Orbeon PresentationServer](http://forge.objectweb.org/projects/ops) | LGPL 
| Orbeon PresentationServer (OPS) is an open source J2EE-based platform for 
XML-centric web applications. OPS is built around XHTML, XForms, XSLT, XML 
pipelines, and Web Services, which makes it ideal for applications that 
capture, process and present XML data. Commercial consulting/training/support 
is available through orbeon.|
+| [PDFcat](http://pdfcat.sourceforge.net/) | LGPL | PDFcat is multi-platform 
catalog manager that provides searching capability over documents among virtual 
catalogs.|
+| [SearchBlox](http://www.searchblox.com/) | Commercial | SearchBlox is a 
high-performance corporate search software designed for the Java 2 Enterprise 
Edition (J2EE) platform.|
+| [SimplexRepaginator](http://www.simplexrepaginator.com/) | Apache License 
V2.0 | Simplex Repaginator converts simplex-scanned PDFs into properly 
duplex-paginated PDFs and vice versa. |
+| [Terrier](http://ir.dcs.gla.ac.uk/terrier/) | MPL | Terrier is software for 
the rapid development of Web, intranet and desktop search engines.|
+| [Triboni GinkGO](http://www.triboni.com/) | Commercial | Triboni GinkGO is a 
highly scalable J2EE services platform that is based on a simple XML business 
object definition and scripting language. Together with XSLT, content-centric 
web applications can be configured in a very short time.|
+| [Zilverline](http://www.zilverline.org/) | Collaborative Source License | 
Zilverline is a search engine that offers web access to your personal or 
intranet content.|
+
+## Articles/Books
+
+| Article Name | Article Abstract|
+| --- | --- |
+| Build an eDoc Reader for your iPod <br/> [Part 1 - User 
Interface](http://www.oreillynet.com/pub/a/mac/2004/12/14/ipod_reader.html) 
<br/> [Part 2 - Document Reading 
Engine](http://www.oreillynet.com/pub/a/mac/2004/12/17/ipod_reader.html) <br/> 
[Part 3 - *Integration with 
PDFBox*](http://www.oreillynet.com/pub/a/mac/2005/01/07/ipod_reader.html) | A 
three part article that discusses the implementation of the PodReader 
application. PodReader is a Cocoa application written in Objective-C, and the article 
discusses how to use the Cocoa-Java bridge to integrate with the Java version 
of PDFBox.|
+| [Lucene In Action](http://www.manning.com/hatcher2/) | A book that discusses 
integrating with the lucene search engine. One chapter discusses how to index 
various file formats and highlights PDFBox for indexing PDF documents.|
+| [Java Developers Journal - March 2005](http://java.sys-con.com/node/48543) | 
An article written by the lead developer of PDFBox discussing text extraction 
and AcroForm integration using PDFBox functionality.|
+| [Refactoring trends across N versions of N Java open source systems: an 
empirical 
study](http://www.dcs.bbk.ac.uk/research/techreps/2005/bbkcs-05-02.pdf) | This 
article describes an empirical study of multiple versions of a range of open 
source Java systems in an attempt to understand whether refactoring occurs and, 
if so, which types of refactoring were most (and least) common. PDFBox is used 
as a case study. |
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/references.mdtext
----------------------------------------------------------------------
diff --git a/content/references.mdtext b/content/references.mdtext
deleted file mode 100644
index 1805246..0000000
--- a/content/references.mdtext
+++ /dev/null
@@ -1,48 +0,0 @@
----
-layout: default
-title:  External Links
----
-
-# External Links
-
-This page lists projects that utilize PDFBox and articles that have been 
written about PDFBox. 
-Please file an [improvement 
issue](https://issues.apache.org/jira/browse/PDFBOX) to get new projects or 
articles added to this page, or to update the information on existing links.
-
-## Projects
-
-| Project Name | License | Project Description |
-| --- | --- | --- |
-| [Alfresco](http://www.alfresco.org/) | LGPL - commercial 
services/support/training is available | Alfresco is an open source, 
open-standards content repository built by the most experienced content 
management team that includes the co-founder of Documentum.|
-| [Apache Nutch](http://nutch.apache.org/) | Apache License V2.0 | Apache 
Nutch is open source web-search software. It builds on Apache Lucene, adding 
web-specifics, such as a crawler, a link-graph database, parsers for HTML and 
other document formats, etc.|
-| [Apache Tika](http://tika.apache.org/) | Apache License V2.0 | Apache Tika 
is a toolkit for detecting and extracting metadata and structured text content 
from various documents using existing parser libraries.|
-| [Centric CRM](http://www.centriccrm.com/) | Free To Use But 
Restricted/Commercial | The Most Advanced Open Source CRM Software.|
-| [Canoo Webtest](http://webtest.canoo.com/webtest/manual/WebTestHome.html) | 
BSD Like | Free OpenSource tool for XP-style acceptance testing of Java-based 
Web applications.|
-| [contineo](http://webtest.canoo.com/webtest/manual/WebTestHome.html) | GPL | 
Contineo is a web based document management system.|
-| [ECM REWOO Scope](http://www.rewoo.de/) | Commercial | REWOO Scope is an 
Enterprise Content Management (ECM) software to organize, structure and 
consolidate enterprise data. Apache PDFBox is an integral part to read and 
index PDF documents.|
-| [Jahia](http://www.jahia.org/) | collaborative source license | The Jahia 
product is currently the most powerful, ready-to-use and affordable integrated 
midrange Java Content Management and Corporate Portal Server.|
-| [jLibrary](http://jlibrary.sourceforge.net/) | BSD | jLibrary is a Document 
Management System, oriented for personal and enterprise use.|
-| [Jomic](http://jomic.sourceforge.net/) | GPL | Jomic is a viewer for comic 
book archives.|
-| [JpdfUnit](http://jpdfunit.sourceforge.net/) | Apache License V2.0 | pdfUnit 
is a framework for testing a generated pdf document with the JUnit Test 
Framework.|
-| [Liferay Portal](http://www.liferay.com/) | MIT | Liferay Portal is an open 
source portal that helps organizations collaborate more efficiently by 
providing a consolidated view of disparate applications.|
-| [LIUS](http://www.bibl.ulaval.ca/lius/index.en.html) | GPL | LIUS is an 
indexing Java framework based on the Jakarta Lucene project. The LIUS framework 
adds to Lucene many file format indexing functionalities, such as Ms Word, Ms 
Excel, Ms PowerPoint, RTF, PDF, XML, HTML, TXT, Open Office suite and 
JavaBeans.|
-| [LuceGene](http://gmod.org/wiki/LuceGene) | Artistic License | LuceGene is 
an open-source document/object search and retrieval system specially tuned for 
bioinformatics text databases and documents.|
-| [Lutece](http://www.lutece.paris.fr/) | BSD-like | Lutece is a portal engine 
which allows you to easily create your websites or intranets based upon 
HTML,XML content.|
-| [MMBase Lucene Module](http://mmapps.sourceforge.net/lucenemodule/) | MPL | 
Lucenemodule is a plugin (module) for the MMBase content management system that 
enables Lucene full text search through its content, and thanks to PDFBox also 
PDF content.|
-| [OpenCms](http://www.opencms.org/) | Custom | OpenCms is a professional 
level Open Source Website Content Management System.|
-| [OpenSearchServer](http://www.open-search-server.com/) | GPLv3 | An open 
source search engine and crawler based on best open source technologies. It is 
a modern search engine and a suite of high-powered full text search algorithms.|
-| [Orbeon PresentationServer](http://forge.objectweb.org/projects/ops) | LGPL 
| Orbeon PresentationServer (OPS) is an open source J2EE-based platform for 
XML-centric web applications. OPS is built around XHTML, XForms, XSLT, XML 
pipelines, and Web Services, which makes it ideal for applications that 
capture, process and present XML data. Commercial consulting/training/support 
is available through orbeon.|
-| [PDFcat](http://pdfcat.sourceforge.net/) | LGPL | PDFcat is multi-platform 
catalog manager that provides searching capability over documents among virtual 
catalogs.|
-| [SearchBlox](http://www.searchblox.com/) | Commercial | SearchBlox is a 
high-performance corporate search software designed for the Java 2 Enterprise 
Edition (J2EE) platform.|
-| [SimplexRepaginator](http://www.simplexrepaginator.com/) | Apache License 
V2.0 | Simplex Repaginator converts simplex-scanned PDFs into properly 
duplex-paginated PDFs and vice versa. |
-| [Terrier](http://ir.dcs.gla.ac.uk/terrier/) | MPL | Terrier is software for 
the rapid development of Web, intranet and desktop search engines.|
-| [Triboni GinkGO](http://www.triboni.com/) | Commercial | Triboni GinkGO is a 
highly scalable J2EE services platform that is based on a simple XML business 
object definition and scripting language. Together with XSLT, content-centric 
web applications can be configured in a very short time.|
-| [Zilverline](http://www.zilverline.org/) | Collaborative Source License | 
Zilverline is a search engine that offers web access to your personal or 
intranet content.|
-
-## Articles/Books
-
-| Article Name | Article Abstract|
-| --- | --- |
-| Build an eDoc Reader for your iPod <br/> [Part 1 - User 
Interface](http://www.oreillynet.com/pub/a/mac/2004/12/14/ipod_reader.html) 
<br/> [Part 2 - Document Reading 
Engine](http://www.oreillynet.com/pub/a/mac/2004/12/17/ipod_reader.html) <br/> 
[Part 3 - *Integration with 
PDFBox*](http://www.oreillynet.com/pub/a/mac/2005/01/07/ipod_reader.html) | A 
three part article that discusses the implementation of the PodReader 
application. PodReader is a Cocoa application written in Objective-C, and the article 
discusses how to use the Cocoa-Java bridge to integrate with the Java version 
of PDFBox.|
-| [Lucene In Action](http://www.manning.com/hatcher2/) | A book that discusses 
integrating with the lucene search engine. One chapter discusses how to index 
various file formats and highlights PDFBox for indexing PDF documents.|
-| [Java Developers Journal - March 2005](http://java.sys-con.com/node/48543) | 
An article written by the lead developer of PDFBox discussing text extraction 
and AcroForm integration using PDFBox functionality.|
-| [Refactoring trends across N versions of N Java open source systems: an 
empirical 
study](http://www.dcs.bbk.ac.uk/research/techreps/2005/bbkcs-05-02.pdf) | This 
article describes an empirical study of multiple versions of a range of open 
source Java systems in an attempt to understand whether refactoring occurs and, 
if so, which types of refactoring were most (and least) common. PDFBox is used 
as a case study. |
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/c68c6530/content/support.md
----------------------------------------------------------------------
diff --git a/content/support.md b/content/support.md
new file mode 100644
index 0000000..0969b5e
--- /dev/null
+++ b/content/support.md
@@ -0,0 +1,53 @@
+---
+layout: default
+title:  Support
+---
+
+# Support
+
+## Questions about How to use PDFBox
+
+If you have questions about how to use PDFBox do ask on the [Users Mailing 
List](/mailinglists.html "Subscribe to Mailing List"). This will get you help 
from the entire community.
+
+The PDFBox examples and the test code in the sources will also provide 
additional information.
+
+And there are additional resources available on sites such as [Stack 
Overflow](http://stackoverflow.com/search?q=pdfbox "Stack Overflow").
+
+
+## Filing a bug report or enhancement request
+
+<p class="alert alert-info">Please refrain from immediately opening a ticket 
in the issue tracker unless 
+you are really certain it's a problem in the PDFBox software. Try using the 
Mailing Lists 
+first.</p>
+
+If you are sure you have found a bug then please report the problem in our 
+[Issue Tracker](https://issues.apache.org/jira/browse/PDFBOX). 
+
+**Before you submit a bug there are several things you can try first**
+
+ - for issues with text extraction check whether Adobe Reader can extract the text
+ - try the latest SNAPSHOT to see if it's fixed in the pre-release
+ - search the mailing list to see if it has been discussed before
+ - check the issue tracker to see if the issue has already been reported
+
+**To help us resolve a bug more quickly**
+
+ - attach the PDF that makes trouble by using "More", "Attach files" in the 
issue tracker
+ - if your file is too large, upload it to a sharehoster, or use the PDFSplit 
application to isolate the troublesome page
+ - mention the PDFBox version you are using.
+ - attach the shortest possible code that reproduces the problem. Insert java 
code between {code}...{code}. Or try to reproduce the problem with the command 
line applications.
+ - mention what you were doing, what was the expected behaviour, and what 
happened instead
+ - provide a stack trace of an exception if there is one
+ - try using the non-sequential parser (loadNonSeq() instead of load(), and 
"-nonSeq" with the command line applications)
+ - search JIRA if your problem has been mentioned before.
+ - Be patient: all the people here are unpaid volunteers who work for you in 
their free time
+
+**And please DON'T**
+
+ - upload files to a hoster that requires registration to read the file.
+ - create an issue in JIRA and then go on vacation so you won't respond to our 
questions / suggestions.
+ - ask "how to" questions in JIRA. Ask such questions on the mailing lists, on 
stackoverflow.com, and look at the sample and the test code in the sources.
+ - attach PDF files with confidential and/or personal data (name, DoB, bank 
data, health data, SSN) without getting permission from the client and/or the 
people mentioned on the PDF
+ - create issues about obsolete PDFBox versions
+
+<p class="alert alert-info">We can sometimes solve problems without having the 
PDF, but it is difficult.</p>
