Yes, I'm only interested in extracting the text (more specifically, searching for different keywords in CVs in Word format).
Where can I find those JUnit testcases? (I'm new to this whole thing.)

/Ylva

On Fri, Mar 14, 2008 at 4:59 PM, Raghu Kaippully <[EMAIL PROTECTED]> wrote:
> Are you just looking to extract text from Word documents? Then HWPF probably
> will do the trick. I am not familiar with Clean Content SDK so can't comment
> on that. Why don't you give HWPF a try? Some of the JUnit testcases already
> operate on extracting text, maybe you can have a look at them.
>
> -Raghu
>
> On Fri, Mar 14, 2008 at 9:15 PM, Ylva Degerfeldt <[EMAIL PROTECTED]>
> wrote:
> >
> > Hi everyone,
> >
> > Maybe I shouldn't ask this on this mailing list but I'm about to start
> > on a project where I'm going to extract different keywords from Word
> > files in the most common formats (like 97 - 2003) and I'd like to know
> > before I start if using POI-HWPF really is the best way to do that.
> >
> > The thing is.. I think I have found another way to do it: Oracle's
> > Clean Content SDK. Has anyone tried this? I was just wondering if it's
> > worth the time and effort to dig deeper into that or if I should
> > simply decide that POI-HWPF is the best solution and forget about the
> > other one. (I have a bit of a tight schedule so that's why I'm
> > asking.)
> >
> > Thanks in advance,
> >
> > Ylva
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
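
For the keyword-search half of the task discussed above, here is a minimal sketch in plain Java. It assumes the document text has already been extracted into a String. With POI-HWPF on the classpath, that text would probably come from something like `new WordExtractor(inputStream).getText()`, but that step is left out so the example runs without any library. The class name `KeywordSearch`, the method `findKeywords`, and the sample text are hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

public class KeywordSearch {

    /**
     * Returns the keywords that occur in the text as whole words,
     * ignoring case, in the order the keywords were given.
     */
    static Set<String> findKeywords(String text, List<String> keywords) {
        Set<String> found = new LinkedHashSet<>();
        for (String keyword : keywords) {
            // Pattern.quote keeps characters such as '+' or '.' literal ("C++", ".NET").
            // The lookarounds reject matches that sit inside a longer word or number,
            // so "java" does not match inside "JavaScript".
            Pattern pattern = Pattern.compile(
                    "(?<![\\p{L}\\p{N}])" + Pattern.quote(keyword) + "(?![\\p{L}\\p{N}])",
                    Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
            if (pattern.matcher(text).find()) {
                found.add(keyword);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Stand-in for text extracted from a .doc file (assumption: extraction happens elsewhere).
        String text = "Experienced Java developer. Worked with SQL and JUnit.";
        System.out.println(findKeywords(text, Arrays.asList("java", "junit", "python")));
        // prints [java, junit]
    }
}
```

Matching on whole words rather than plain substrings is a design choice: it avoids false hits on short keywords, at the cost of missing inflected forms such as plurals.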
