synfin wrote:
Hello,

I'm working on a program that simply extracts the body of text from
.odt and .doc files as a String. The snippet of code below either
prints the entire document or simply a blank line, which seems to be
dependent on the length of the document. That is, after passing a
certain length (44 pages), the program will no longer output anything
besides a blank line. Could anyone please lend me some insight? I'm
also open to other (better) ways to extract a simple string from any
document. Thank you.

Kind regards,
Josh

Josh,

For questions like this concerning the OOo API, you will probably find more help over at [EMAIL PROTECTED]. (One thing that comes to mind is that much of the OOo code internally uses a string type (the C++ tools String) that can only handle strings of fewer than 64K characters. That might be why you get an empty string for a long document.)

-Stephan

P.S. How can I enable debugging output from soffice.bin?

<CODE>

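       // Imports this snippet relies on:
       import com.sun.star.lang.XComponent;
       import com.sun.star.text.XText;
       import com.sun.star.text.XTextDocument;
       import com.sun.star.uno.UnoRuntime;

       // Load the document (openDocument(...) is a helper defined elsewhere)
       XComponent openDocument = openDocument(documentProperties);

       // Query its XTextDocument interface to get the text
       XTextDocument scrapeDoc = (XTextDocument) UnoRuntime.queryInterface(
           XTextDocument.class, openDocument);

       // The XText object represents the whole document body
       XText scrapeText = scrapeDoc.getText();

       // Print body of text to stdout (blank for very long documents)
       System.out.println(scrapeText.getString());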

</CODE>
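
If the 64K string limit mentioned in the reply really is the cause (this is only a guess, not something confirmed in this thread), one possible workaround is to avoid asking for the whole body in a single getString() call and instead collect the text paragraph by paragraph on the Java side. The sketch below shows only the accumulation step. The class ParagraphJoiner and the method joinParagraphs are hypothetical names, and the Iterable<String> stands in for whatever yields the paragraph strings; with the OOo API that would presumably be an enumeration obtained by querying the XText for XEnumerationAccess and calling getString() on each paragraph's XTextRange.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the OOo API.
public class ParagraphJoiner {

    // Joins paragraph strings with newlines. A Java StringBuilder is not
    // bound by the (assumed) 64K limit of the internal C++ string type,
    // so the joined result may be longer than any single paragraph.
    static String joinParagraphs(Iterable<String> paragraphs) {
        StringBuilder body = new StringBuilder();
        boolean first = true;
        for (String paragraph : paragraphs) {
            if (!first) {
                body.append('\n'); // separator between paragraphs
            }
            body.append(paragraph);
            first = false;
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Stand-in data; in the real program each string would come from
        // one paragraph of the opened document.
        List<String> paragraphs =
            Arrays.asList("First paragraph.", "Second paragraph.");
        System.out.println(joinParagraphs(paragraphs));
    }
}
```

This only helps if every individual paragraph stays below the limit; a single paragraph longer than that would presumably still come back empty.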
