All the HWPF code is under the scratchpad section of the Subversion repository: http://svn.apache.org/viewvc/poi/trunk/src/scratchpad/
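For the keyword search described in the quoted thread, the part after text extraction is plain string handling. Here is a minimal sketch of that step, assuming the document text is already available as a String. KeywordScanner and countKeywords are hypothetical names, not part of POI. The commented-out WordExtractor call is one way the text might be obtained from HWPF and should be checked against the POI version in use.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not part of POI): counts keyword hits in already-extracted text. */
final class KeywordScanner {

    private KeywordScanner() {}

    /**
     * Counts case-insensitive, non-overlapping substring occurrences of each keyword.
     * Plain substring matching is used, so "Java" would also match inside "JavaScript".
     */
    static Map<String, Integer> countKeywords(String text, List<String> keywords) {
        String haystack = text.toLowerCase(Locale.ROOT);
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String keyword : keywords) {
            String needle = keyword.toLowerCase(Locale.ROOT);
            int count = 0;
            if (!needle.isEmpty()) {
                int from = 0;
                int idx;
                // Walk forward through the text, jumping past each match.
                while ((idx = haystack.indexOf(needle, from)) >= 0) {
                    count++;
                    from = idx + needle.length();
                }
            }
            counts.put(keyword, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumption: in a real run the text would come from HWPF, for example
        //   String text = new org.apache.poi.hwpf.extractor.WordExtractor(
        //           new java.io.FileInputStream("cv.doc")).getText();
        // A literal stands in here so the sketch runs without POI on the classpath.
        String text = "Senior Java developer. Built Java services backed by SQL databases.";
        Map<String, Integer> hits = countKeywords(text, List.of("Java", "SQL", "Python"));
        hits.forEach((keyword, n) -> System.out.println(keyword + ": " + n));
    }
}
```

The result map preserves the order in which the keywords were supplied, and a keyword that never appears is reported with a count of 0 rather than left out.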
For a simple example, have a look at testcases/org/apache/poi/hwpf/usermodel/TestProblems.java under the scratchpad. It has a method, testRangeDelete(), that scans the text pieces.

-Raghu

On Fri, Mar 14, 2008 at 9:46 PM, Ylva Degerfeldt <[EMAIL PROTECTED]> wrote:

> Yes, I'm only interested in extracting the text (more specifically,
> searching for different keywords in CVs in Word format).
>
> Where can I find those JUnit testcases? (I'm new to this whole thing.)
>
> /Ylva
>
> On Fri, Mar 14, 2008 at 4:59 PM, Raghu Kaippully <[EMAIL PROTECTED]> wrote:
> > Are you just looking to extract text from Word documents? Then HWPF
> > probably will do the trick. I am not familiar with Clean Content SDK, so
> > I can't comment on that. Why don't you give HWPF a try? Some of the JUnit
> > testcases already operate on extracting text; maybe you can have a look
> > at them.
> >
> > -Raghu
> >
> > On Fri, Mar 14, 2008 at 9:15 PM, Ylva Degerfeldt <[EMAIL PROTECTED]> wrote:
> > >
> > > Hi everyone,
> > >
> > > Maybe I shouldn't ask this on this mailing list, but I'm about to start
> > > on a project where I'm going to extract different keywords from Word
> > > files in the most common formats (like 97 - 2003), and I'd like to know
> > > before I start if using POI-HWPF really is the best way to do that.
> > >
> > > The thing is, I think I have found another way to do it: Oracle's
> > > Clean Content SDK. Has anyone tried this? I was just wondering if it's
> > > worth the time and effort to dig deeper into that or if I should
> > > simply decide that POI-HWPF is the best solution and forget about the
> > > other one. (I have a bit of a tight schedule, so that's why I'm
> > > asking.)
> > >
> > > Thanks in advance,
> > >
> > > Ylva
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
