Author: mattmann
Date: Fri Jul 16 16:57:20 2010
New Revision: 964858

URL: http://svn.apache.org/viewvc?rev=964858&view=rev
Log:
- update index doc to include 0.7 skeleton

Modified:
    tika/trunk/src/site/apt/index.apt

Modified: tika/trunk/src/site/apt/index.apt
URL: http://svn.apache.org/viewvc/tika/trunk/src/site/apt/index.apt?rev=964858&r1=964857&r2=964858&view=diff
==============================================================================
--- tika/trunk/src/site/apt/index.apt (original)
+++ tika/trunk/src/site/apt/index.apt Fri Jul 16 16:57:20 2010
@@ -1,5 +1,5 @@
                        ---------------
-                       Apache Tika 0.6
+                       Apache Tika 0.7
                        ---------------
 
 ~~ Licensed to the Apache Software Foundation (ASF) under one or more
@@ -17,96 +17,62 @@
 ~~ See the License for the specific language governing permissions and
 ~~ limitations under the License.
 
-Apache Tika 0.6
+Apache Tika 0.7
 
-   The most notable changes in Tika 0.6 over the previous release are:
 
-      * Mime-type detection for HTML (and all types) has been improved,
-        allowing malformed HTML files and those HTML files that require
-        a bit more observed content before the type is properly detected,
-        are now correctly identified by the AutoDetectParser.
-        ({{{https://issues.apache.org/jira/browse/TIKA-327}TIKA-327}},
-         {{{https://issues.apache.org/jira/browse/TIKA-357}TIKA-357}},
-         {{{https://issues.apache.org/jira/browse/TIKA-366}TIKA-366}},
-         {{{https://issues.apache.org/jira/browse/TIKA-367}TIKA-367}})
-
-      * Tika now has an additional OSGi bundle packaging that includes all
-        the required parser libraries. This bundle package makes it easy to
-        use all Tika features in an OSGi environment.
-        ({{{https://issues.apache.org/jira/browse/TIKA-340}TIKA-340}},
-         {{{https://issues.apache.org/jira/browse/TIKA-342}TIKA-342}})
-
-      * The Apache POI dependency used for parsing Microsoft Office file
-        formats has been upgraded to version 3.6. The most visible
-        improvement in this version is the notably reduced ooxml jar file
-        size. The tika-app jar size is now down to 15MB from the 25MB in
-        Tika 0.5.
-        ({{{https://issues.apache.org/jira/browse/TIKA-353}TIKA-353}})
-
-      * Handling of character encoding information in input metadata and
-        HTML \<meta\> tags has been improved. When no applicable encoding
-        information is available, the encoding is detected by looking at
-        the input data.
-        ({{{https://issues.apache.org/jira/browse/TIKA-332}TIKA-332}},
-         {{{https://issues.apache.org/jira/browse/TIKA-334}TIKA-334}},
-         {{{https://issues.apache.org/jira/browse/TIKA-335}TIKA-335}},
-         {{{https://issues.apache.org/jira/browse/TIKA-341}TIKA-341}}) 
-
-      * Some document types like Excel spreadsheets contain content like
-        numbers or formulas whose exact text format depends on the current
-        locale. So far Tika has used the platform default locale in such
-        cases, but clients can now explicitly specify the locale by passing
-        a Locale instance in the parse context.
-        ({{{https://issues.apache.org/jira/browse/TIKA-125}TIKA-125}})
-
-      * The default text output encoding of the tika-app jar is now UTF-8
-        when running on Mac OS X. This is because the default encoding used
-        by Java is not compatible with the console application in Mac OS X.
-        On all other platforms the text output from tika-app still uses
-        the platform default encoding.
-        ({{{https://issues.apache.org/jira/browse/TIKA-324}TIKA-324}})
-
-      * A flash video (video/x-flv) parser has been added.
-        ({{{https://issues.apache.org/jira/browse/TIKA-328}TIKA-328}})
- 
-      * The handling of Number and Date cell formatting within the
-        Microsoft Excel documents has been added. This include currencies,
-        percentages and scientific formats.
-        ({{{https://issues.apache.org/jira/browse/TIKA-103}TIKA-103}})
+   The most notable changes in Tika 0.7 over the previous release are:
 
-   The following people have contributed to Tika 0.6 by submitting or
-   commenting on the issues resolved in this release:
-
-      * Andrzej Bialecki
-
-      * Bertrand Delacretaz
-
-      * Chris A. Mattmann
-
-      * Dave Meikle
-
-      * Erik Hetzner
+      * MP3 file parsing was improved, including Channel and SampleRate 
+        extraction and ID3v2 support ({{{https://issues.apache.org/jira/browse/TIKA-368}TIKA-368}}, 
+        {{{https://issues.apache.org/jira/browse/TIKA-372}TIKA-372}}). Further, audio
+        parsing mime detection was also improved for the MIDI format. 
+        ({{{https://issues.apache.org/jira/browse/TIKA-199}TIKA-199}})
 
-      * Felix Meschberger
+      * Tika no longer relies on X11 for its RTF parsing functionality. 
+        ({{{https://issues.apache.org/jira/browse/TIKA-386}TIKA-386}})
 
-      * Jukka Zitting
+      * A Thread-safe bug in the AutoDetectParser was discovered and 
+        addressed. ({{{https://issues.apache.org/jira/browse/TIKA-374}TIKA-374}})
 
-      * Julien Nioche
+      * Upgrade to PDFBox 1.0.0. The new PDFBox version improves PDF parsing
+        performance and fixes a number of text extraction issues. 
+        ({{{https://issues.apache.org/jira/browse/TIKA-380}TIKA-380}})
+   
 
-      * Ken Krugler  
-
-      * Luke Nezda
-
-      * Maxim Valyanskiy
-
-      * Niall Pemberton
-
-      * Peter Wolanin 
-
-      * Piotr B.
+   The following people have contributed to Tika 0.7 by submitting or
+   commenting on the issues resolved in this release:
 
-      * Sami Siren
+      * Adam Rauch 
+      
+      * Benson Margulies 
+
+      * Brett S. 
+      
+      * Chris A. Mattmann 
+      
+      * Daan de Wit 
+      
+      * Dave Meikle 
+      
+      * Durville 
+      
+      * Ingo Renner 
+      
+      * Jukka Zitting 
+      
+      * Ken Krugler 
+      
+      * Kenny Neal 
+      
+      * Markus Goldbach
+      
+      * Maxim Valyanskiy 
+      
+      * Nick Burch  
+      
+      * Sami Siren 
+      
+      * Uwe Schindler 
 
-      * Yuan-Fang Li
 
-   See {{http://tinyurl.com/yc3dk67}} for more details on these contributions.
+   See {{http://tinyurl.com/yklopby}} for more details on these contributions.

