Author: mattmann
Date: Fri Nov 11 00:09:11 2016
New Revision: 1769231

URL: http://svn.apache.org/viewvc?rev=1769231&view=rev
Log:
Generate index for Apache Tika 1.14.

Added:
    tika/site/src/site/apt/1.14/index.apt

Added: tika/site/src/site/apt/1.14/index.apt
URL: http://svn.apache.org/viewvc/tika/site/src/site/apt/1.14/index.apt?rev=1769231&view=auto
==============================================================================
--- tika/site/src/site/apt/1.14/index.apt (added)
+++ tika/site/src/site/apt/1.14/index.apt Fri Nov 11 00:09:11 2016
@@ -0,0 +1,156 @@
+                       ----------------
+                       Apache Tika 1.14
+                       ----------------
+
+~~ Licensed to the Apache Software Foundation (ASF) under one or more
+~~ contributor license agreements.  See the NOTICE file distributed with
+~~ this work for additional information regarding copyright ownership.
+~~ The ASF licenses this file to You under the Apache License, Version 2.0
+~~ (the "License"); you may not use this file except in compliance with
+~~ the License.  You may obtain a copy of the License at
+~~
+~~     http://www.apache.org/licenses/LICENSE-2.0
+~~
+~~ Unless required by applicable law or agreed to in writing, software
+~~ distributed under the License is distributed on an "AS IS" BASIS,
+~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+~~ See the License for the specific language governing permissions and
+~~ limitations under the License.
+
+Apache Tika 1.14
+
+        The most notable changes in Tika 1.14 over the previous release are:
+
+        * Extract all headers from MSG/RFC822 ({{{http://issues.apache.org/jira/browse/TIKA-2122}TIKA-2122}}).
+
+        * 9.1 ({{{http://issues.apache.org/jira/browse/TIKA-2113}TIKA-2113}}).
+
+        * Extract PDF DocInfo metadata into separate keys to prevent overwriting by XMP metadata ({{{http://issues.apache.org/jira/browse/TIKA-2057}TIKA-2057}}).
+
+        * Re-enable fileUrl for tika-server ({{{http://issues.apache.org/jira/browse/TIKA-2081}TIKA-2081}}).  If you choose to use this feature, beware of the security vulnerabilities! See: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2015-3271
+
+        * Add Tesseract's hOCR output format as an option, via Eric Pugh ({{{http://issues.apache.org/jira/browse/TIKA-2093}TIKA-2093}}).
+
+        * Extract macros from MSOffice files ({{{http://issues.apache.org/jira/browse/TIKA-2069}TIKA-2069}}).
+
+        * Maintain passed-in mime in TXTParser ({{{http://issues.apache.org/jira/browse/TIKA-2047}TIKA-2047}}).
+
+        * Upgrade to POI 3.15 ({{{http://issues.apache.org/jira/browse/TIKA-2013}TIKA-2013}}).
+
+         * 0.3 ({{{http://issues.apache.org/jira/browse/TIKA-2051}TIKA-2051}}).
+
+         * Fix hyperlinks with formatting in DOC and DOCX ({{{http://issues.apache.org/jira/browse/TIKA-1255}TIKA-1255}} and {{{http://issues.apache.org/jira/browse/TIKA-2078}TIKA-2078}}).
+
+         * Tika now integrates with Google's TensorFlow library and can use its Inception v3 image classification model to identify objects in images ({{{http://issues.apache.org/jira/browse/TIKA-1993}TIKA-1993}}).
+
+         * Parser configuration is now type-safe, and parameters for parsers can have assigned types ({{{http://issues.apache.org/jira/browse/TIKA-1508}TIKA-1508}}, {{{http://issues.apache.org/jira/browse/TIKA-1986}TIKA-1986}}).
+
+         * Prevent OOM/permanent hang on some corrupt CHM files ({{{http://issues.apache.org/jira/browse/TIKA-2040}TIKA-2040}}).
+
+         * Upgrade ICU4J charset detection components to fix multithreading bug ({{{http://issues.apache.org/jira/browse/TIKA-2041}TIKA-2041}}).
+
+         * 1.4 ({{{http://issues.apache.org/jira/browse/TIKA-2039}TIKA-2039}}).
+
+         * Maintain more significant digits in cells of "General" format in XLS and XLSX ({{{http://issues.apache.org/jira/browse/TIKA-2025}TIKA-2025}}).
+
+        * Avoid mark/reset issues when extracting or detecting embedded resources in RFC822 emails ({{{http://issues.apache.org/jira/browse/TIKA-2037}TIKA-2037}}).
+
+        * Improve accuracy of Tesseract for better extraction of numeric and alphanumeric text from images ({{{http://issues.apache.org/jira/browse/TIKA-2021}TIKA-2021}}, {{{http://issues.apache.org/jira/browse/TIKA-2031}TIKA-2031}}).
+
+         * Improve extraction of embedded documents from PPT, PPTX and XLSX ({{{http://issues.apache.org/jira/browse/TIKA-2026}TIKA-2026}}).
+
+        * Add parser for applefile (AppleSingle) ({{{http://issues.apache.org/jira/browse/TIKA-2022}TIKA-2022}}).
+
+        * Add mime types, mime magic and/or globs for:
+
+        ** Endnote Import File ({{{http://issues.apache.org/jira/browse/TIKA-2011}TIKA-2011}})
+
+        ** DJVU files ({{{http://issues.apache.org/jira/browse/TIKA-2009}TIKA-2009}})
+
+        ** MS Owner File ({{{http://issues.apache.org/jira/browse/TIKA-2008}TIKA-2008}})
+
+        ** Windows Media Metafile ({{{http://issues.apache.org/jira/browse/TIKA-2004}TIKA-2004}})
+
+        ** iCal and vCalendar ({{{http://issues.apache.org/jira/browse/TIKA-2006}TIKA-2006}})
+
+        ** MBOX ({{{http://issues.apache.org/jira/browse/TIKA-2042}TIKA-2042}})
+
+        ** Stata DTA ({{{http://issues.apache.org/jira/browse/TIKA-2064}TIKA-2064}})
+
+        * Add configurable maximum threshold for number of events extracted from the XMP Media Management Schema in JempboxExtractor ({{{http://issues.apache.org/jira/browse/TIKA-1999}TIKA-1999}}).
+
+        * Integrate TesseractOCR with full page image rendering for PDFs ({{{http://issues.apache.org/jira/browse/TIKA-1994}TIKA-1994}}).
+
+        * Add mime detection (via Nick C) and parser for DBF files ({{{http://issues.apache.org/jira/browse/TIKA-1513}TIKA-1513}}).
+
+        * Add mime detection and parsers for MSOffice 2003 XML Word and Excel formats ({{{http://issues.apache.org/jira/browse/TIKA-1958}TIKA-1958}}).
+
+        * Extract hyperlinks from PPT, PPTX, XLSX ({{{http://issues.apache.org/jira/browse/TIKA-1454}TIKA-1454}}).
+
+
+   The following people have contributed to Tika 1.14 by submitting or
+   commenting on the issues resolved in this release:
+
+        * Aeham Abushwashi
+ 
+        * Alan Hunter
+
+        * Alexander Kazakov
+
+        * Chris A. Mattmann
+
+        * Chris Knott
+
+        * Egbert
+
+        * Eli Trucco
+
+        * Eric Pugh
+
+        * Jean Coudon
+
+        * Jeff Swindle
+
+        * John Dougrez-Lewis
+
+        * John Haynes
+
+        * Joseph Naegele
+
+        * Josh Cummings
+
+        * Ken Krugler
+
+        * Kukushkin Alexander
+
+        * Lewis John McGibbney
+
+        * Luis Filipe Nassif
+
+        * Matthias Pigulla
+
+        * Nam-Quang Tran
+
+        * Nilay Chheda
+
+        * Philipp Steinkrueger
+
+        * Sara Miller
+
+        * Sebastian Iturra
+
+        * Thamme Gowda
+
+        * Tilman Hausherr
+
+        * Tim Allison
+
+        * Tim Barrett
+
+        * Vjeran Marcinko
+
+        * Yahav Amsalem
+
+        * Zarana Parekh
+
+   See {{https://s.apache.org/TRWa}} for more details on these contributions.

