I have a solution, which I have tried to attach to this mail. If the
attachment has been stripped for security reasons, please let me know and
I will send it to your private email address.

Add the jar to the classpath of your project and use it like this:
public String getMSWordContent(InputStream is) throws Exception {
    String temp = null;
    europarl.trad.sild.msword.extraction.WordExtractor wd =
            new europarl.trad.sild.msword.extraction.WordExtractor();
    try {
        temp = wd.extractText(is);
    } catch (Exception e) {
        throw new Exception("getTextMiningContent: " + e.toString());
    }
    return temp;
}
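
If you also want the stream closed afterwards and the original exception
kept as the cause, a variant could look roughly like the sketch below. Note
that TextExtractor is a hypothetical stand-in interface, made up here only so
the sketch compiles and runs without the jar; in real use you would call
extractText on the WordExtractor from the jar, as in the method above. The
class name WordContentExample and the fake extractor in main are assumptions
as well.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class WordContentExample {

    /** Hypothetical stand-in for the extractor class shipped in the jar. */
    interface TextExtractor {
        String extractText(InputStream is) throws Exception;
    }

    /**
     * Same shape as getMSWordContent above, but the stream is closed when
     * extraction finishes and the original exception is kept as the cause.
     */
    static String getMSWordContent(TextExtractor extractor, InputStream source)
            throws IOException {
        try (InputStream is = source) {
            return extractor.extractText(is);
        } catch (Exception e) {
            // Pass e as the cause so the original stack trace is not lost.
            throw new IOException("getMSWordContent: " + e, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Fake extractor (assumption): it only decodes the bytes as UTF-8,
        // it does not parse a real .doc file.
        TextExtractor fake =
                is -> new String(is.readAllBytes(), StandardCharsets.UTF_8);
        InputStream in =
                new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(getMSWordContent(fake, in));
    }
}
```

With the real jar you would pass a FileInputStream for the .doc file instead
of the in-memory stream used here.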
-----Original Message-----
From: chris.b [mailto:[EMAIL PROTECTED]
Sent: 27 November 2007 16:15
To: [email protected]
Subject: Re: Problem with word documents
here's a sample file that i wasn't able to index
http://www.nabble.com/file/p13972759/monte.doc monte.doc
thanks for the help :)
Rainer Schwarze wrote:
>
> chris.b wrote:
>> seen as it only happens with documents that are created and saved
>> using open office, could that be the problem?
>> i know that this part has nothing to do with poi, but should i in
>> that case try using the open office sdk to handle word documents?
>>
>> thank you
>>
>> Chris
>
> I just tried to open a OpenOffice generated Word file in my HWPF
> version and it worked without problems. If you can send me a sample
> file, I can take a look at it.
> Regarding the Open Office SDK: I never used it so I can't say much
> about it - if it works for you, you may have less trouble reading all
> the different Word files. It depends a bit on how complex your files
> are (full blown fancy flyers made in Word or simple reports consisting
> of text mainly...).
>
> Best wishes, Rainer
> --
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>
>
--
View this message in context:
http://www.nabble.com/Problem-with-word-documents-tf4877644.html#a13972759
Sent from the POI - User mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]