Version 0.4 of the TextMining.org text extraction library has been released!
I have finally gotten around to releasing a new version of the textmining.org text extractor. This is a pure java library for extracting text from Word 6.0/97/2000/XP/2003. Some highlights from this release: -I removed support for PDF documents. I was only wrapping the excellent PDFBox (http://www.pdfbox.org) library with a few lines of code. -I added support for Word 6.0 documents. -The extractor will no longer extract text that has been deleted but is still in the document because of revision tracking -I added two exceptions, PasswordProtectedException and FastSavedException, for more graceful failures. -Fixed bugs -Updated the license to Apache 2.0 A special thanks to BeeText Inc. (http://www.beetext.com) They are a software company that is on the cutting edge of software development for translation professionals. Besides that, they sponsored all of the above changes. Remember...support companies that support open source! -Ryan Ackley --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
