Are you just looking to extract text from Word documents? Then HWPF will
probably do the trick. I am not familiar with the Clean Content SDK, so I
can't comment on that. Why don't you give HWPF a try? Some of the JUnit test
cases already extract text, so maybe you can have a look at them.
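
Once you have the plain text out of the document, the keyword part is
ordinary string handling. Here is a rough sketch of that step only, using
nothing but the JDK. The class name KeywordCounter, the method
countKeywords and the sample sentence are made up for illustration, and I
am assuming the text would come from HWPF's WordExtractor (getText()), so
check that against the POI version you are using.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of POI: counts keyword hits in extracted text.
public class KeywordCounter {

    // Counts case-insensitive whole-word occurrences of each keyword.
    // Keywords are expected to be given in lower case.
    static Map<String, Integer> countKeywords(String text, Set<String> keywords) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        // Start every keyword at zero so missing ones still show up in the result.
        for (String keyword : new TreeSet<>(keywords)) {
            counts.put(keyword, 0);
        }
        // Split on any run of characters that are neither letters nor digits.
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String token : tokens) {
            if (counts.containsKey(token)) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumption: in a real program this string would be the output of
        // HWPF's WordExtractor.getText() rather than a hard-coded sample.
        String text = "Invoice total: the invoice is due. Budget approved.";
        Set<String> keywords = new TreeSet<>(Arrays.asList("invoice", "budget"));
        System.out.println(countKeywords(text, keywords));
    }
}
```

This only matches single words exactly; phrases or stemming would need
more than a simple split.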

-Raghu

On Fri, Mar 14, 2008 at 9:15 PM, Ylva Degerfeldt <[EMAIL PROTECTED]>
wrote:

> Hi everyone,
>
> Maybe I shouldn't ask this on this mailing list but I'm about to start
> on a project where I'm going to extract different keywords from Word
> files in the most common formats (like 97 - 2003) and I'd like to know
> before I start if using POI-HWPF really is the best way to do that.
>
> The thing is.. I think I have found another way to do it: Oracle's
> Clean Content SDK. Has anyone tried this? I was just wondering if it's
> worth the time and effort to dig deeper into that or if I should
> simply decide that POI-HWPF is the best solution and forget about the
> other one. (I have a bit of a tight schedule so that's why I'm
> asking.)
>
> Thanks in advance,
>
> Ylva