DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED
COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=44431>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=44431

           Summary: HWPFDocument.write destroys fields
           Product: POI
           Version: unspecified
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HWPF
        AssignedTo: [email protected]
        ReportedBy: [EMAIL PROTECTED]


When I open and resave a Word document with

// Reproduction: read a .doc file and write it back without modifying it.
InputStream is = new FileInputStream("/home/esempio.doc");
HWPFDocument docInput = new HWPFDocument(is);
OutputStream os = new FileOutputStream("/home/TEST_POI.doc");
docInput.write(os);
os.close();
is.close();

all fields in the document (TOC items, STYLEREF and so on) are destroyed and
converted to plain text; for example, a FILENAME field becomes "STYLEREF
TitoloDocumento \* MERGEFORMAT esempio.doc".

The problem may lie in the handling of control characters: fields in MS Word
are stored inline in the normal text, as a sequence like

0x13 <field info> 0x14 <field value> 0x15

and the text in the document saved by POI becomes

<field info> <field value>
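
The marker layout described above can be illustrated with a small sketch. This
is not POI code: the class FieldScan and its method describe are hypothetical
names, and the sketch assumes un-nested fields in a plain Java string that
still contains the 0x13, 0x14 and 0x15 characters.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldScan {
    // Marker characters as described in this report.
    static final char BEGIN = 0x13, SEP = 0x14, END = 0x15;

    /** Returns one "field info => field value" entry per un-nested field in raw. */
    static List<String> describe(String raw) {
        List<String> out = new ArrayList<String>();
        int begin = raw.indexOf(BEGIN);
        while (begin >= 0) {
            int end = raw.indexOf(END, begin + 1);
            if (end < 0) {
                break; // unterminated field: stop scanning
            }
            int sep = raw.indexOf(SEP, begin + 1);
            boolean hasSep = sep >= 0 && sep < end;
            // Field info runs from the begin mark to the separator (or to the end mark).
            String info = raw.substring(begin + 1, hasSep ? sep : end).trim();
            // Field value runs from the separator to the end mark, if a separator exists.
            String value = hasSep ? raw.substring(sep + 1, end) : "";
            out.add(info + " => " + value);
            begin = raw.indexOf(BEGIN, end + 1);
        }
        return out;
    }
}
```

If the markers are dropped on save, as seems to happen here, a scan like this
finds no fields at all in the resaved text.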

The same problem also affects text extraction: a text portion like

File name is [esempio.doc]

in which "[esempio.doc]" represents a filename field, becomes

File name is STYLEREF TitoloDocumento \* MERGEFORMAT esempio.doc

in extracted text.
I have partially solved the extraction issue with the following Java method
(s is the text portion to clean)

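// rimuoviCampi = "remove fields"
private static String rimuoviCampi(String s) {
        // Drop each field code: from the begin mark (0x13) up to the separator (0x14).
        s = s.replaceAll("\\x13[^\\x13\\x14]*\\x14", "");
        // Drop the end mark (0x15), leaving only the field value in the text.
        s = s.replaceAll("\\x15", "");
        s = s.trim();
        return s;
}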

but the problem remains unsolved when saving the document.
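
For the extraction side only, a variant of the method above can track field
depth instead of using regular expressions. This is a sketch under the
assumption that fields may be nested and that some fields have no 0x14
separator; FieldText and visibleText are hypothetical names, not part of POI,
and the sketch does nothing for the saving problem.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FieldText {
    /** Keeps only displayed text: drops field codes and the three marker characters. */
    static String visibleText(String raw) {
        StringBuilder out = new StringBuilder();
        // One entry per open field: TRUE while we are still inside its code part.
        Deque<Boolean> open = new ArrayDeque<Boolean>();
        int hidden = 0; // number of open fields whose code part we are inside
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == 0x13) {            // begin mark: a field code starts
                open.push(Boolean.TRUE);
                hidden++;
            } else if (c == 0x14) {     // separator: innermost field switches to its value
                if (!open.isEmpty() && open.peek()) {
                    open.pop();
                    open.push(Boolean.FALSE);
                    hidden--;
                }
            } else if (c == 0x15) {     // end mark: innermost field closes
                if (!open.isEmpty() && open.pop()) {
                    hidden--;           // field had no separator, so it had no value
                }
            } else if (hidden == 0) {
                out.append(c);          // ordinary text or field value
            }
        }
        return out.toString();
    }
}
```

Unlike the regular-expression version, this one does not trim the result, so
callers that need trimming would call trim() on the returned string.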

Thanks in advance

Domenico

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
