About all I can say is that it's a US gov't project parsing raw text from documents in many, many formats. It's not too evil in its purpose ;-)
I have used Lucene in the past, back when I worked in the oil industry. Yet another fast indexing engine for oil metadata. This was back when Lucene was first conceived, and I even had a class with Erik Hatcher as the presenter. :-)

Thanks again. Now I just have to read up on how to commit a fix to your source that I had to make to enable full-text scraping of PPTs.

-Eric-

David Fisher wrote:
>
> Eric,
>
> You are very welcome! It is always encouraging to hear about success
> stories!
>
> Can you tell us about your project, is it commercial, private, or
> open-source?
>
> Do you make use of other Apache projects like Lucene?
>
> Best Regards,
> Dave
>
> On Jun 5, 2009, at 8:26 AM, Tgui wrote:
>
>> I just wanted to post a quick thanks to any persons who actively
>> contribute in any way to the POI project. I've been an on and off user
>> since the 1.0 release and am continually amazed at the progress made.
>>
>> Just another data point for (y'all): in two days I was able to rip
>> Open Office from our semantic processing engine and replace it with
>> POI. Throughput went through the roof while of course distribution
>> size fell by 200 megs. On top of it all, the cool extractors for PPT,
>> XLS and Word documents do a more complete job of pulling raw text than
>> what was done with Open Office. More keywords, better semantic results!
>>
>> I hope this post wasn't too much of an annoyance. Having a few open
>> source projects of my own, I know what it's like when people seem to
>> post only issues and complaints.
>>
>> Good job!
>> -Eric Morgan-
>> --
>> View this message in context:
>> http://www.nabble.com/Thank-you-tp23890390p23890390.html
>> Sent from the POI - User mailing list archive at Nabble.com.
--
View this message in context: http://www.nabble.com/Thank-you-tp23890390p24813687.html
Sent from the POI - User mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
