Thanks for the info (comments inline). More generally:
<quote source="http://jakarta.apache.org/poi/">
Lucene for which we'll soon have file format interpretors.
</quote>

I guess I have the latest information on Word support. Any advances for
the Excel format? Any alpha-quality code I could test?

cheers,
sv

On Fri, 16 Apr 2004, Ryan Ackley wrote:

> Stephane,
>
> The textmining.org became sort of a stop gap to support people who wanted to
> extract text from Word docs while I was working on HWPF. However now there
> is a feature in the textmining.org library that I don't plan on adding to
> HWPF and that is support for Word 6.0/95.

Is there a reason for this?

> The post I made to lucene-user about PowerPoint to text was a repost from
> poi-user that someone had posted. I haven't gotten around to testing it out
> but I have referred several people to it and I haven't heard back from them,
> so I assume it works.

OK, I'll probably add the original post to the lucene wiki so as to not
lose the information.

> The relationship between textmining.org and POI is that I am the principal
> author of HWPF and I am the principal author of the textmining.org
> libraries. I should just donate it to lucene because it is becoming a major
> hassle to maintain. Although I don't know...it has gotten me some side work.
> So I don't know what I plan on doing with it.

Side work is good ;) I know of a few people who happily use the package.

As a future user of your contributions, I'd like to thank you in advance.

> -Ryan
>
> ----- Original Message -----
> From: "Stephane James Vaucher" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Friday, April 16, 2004 3:52 PM
> Subject: POI & Lucene integration
>
>
> > Hi everyone.
> >
> > I'm planning on using POI to add MSOffice doc support to my app using
> > Lucene. I know there's been work going on to facilitate the integration.
> > I've checked-out the latest dist out of cvs, did a grep -i lucene on the
> > *java files. Found nothing.
> > Is the work available somewhere (or any
> > interesting references)?
> >
> > On another note, I'm trying out TextMining, and I'm a bit confused. It
> > comes distributed with classes in a org.apache.poi package I can't find in
> > the poi dist: poi-bin-2.5-final-20040302.tar.gz, specifically:
> > org/apache/poi/hwpf/*
> >
> > What is the relationship between the projects?
> >
> > Slightly OT, Ryan, in this message:
> >
> > http://marc.theaimsgroup.com/?l=lucene-user&m=108030527420219&w=4
> >
> > you mentioned maybe adding a basic support for powerpoint doc text
> > extraction. Has anyone looked at this?
> >
> > cheers,
> > Stephane Vaucher
> > CIRANO
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
