https://bz.apache.org/bugzilla/show_bug.cgi?id=60975
Bug ID: 60975
Summary: Error converting doc with excel correspondence to html
Product: POI
Version: 3.16-dev
Hardware: PC
Status: NEW
Severity: critical
Priority: P2
Component: HWPF
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: ---
Hi,
I am trying to convert a .doc document to HTML.
The document is the result of a Word mail merge (the "correspondence" option)
that pulls values from an Excel table. The merged values appear correctly in
Word, but internally Word stores each of them as a field of the form
MERGEFIELD {FIELD} VALUE.
The problem is that if I add an ENTER (line break) inside one of these merged
words or sentences, converting the document to HTML with
wordToHtmlConverter.processDocument(doc) duplicates the words that come after
the ENTER.
Example:
In the .doc document:
Phrase brought from
Excel
After the processDocument method:
Phrase brought from excel
Excel
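For readers unfamiliar with how such a field sits in the document text, here is
a minimal sketch in plain Java. It is a model only, not POI's implementation.
It assumes the field delimiters of the Word binary format (0x13 field begin,
0x14 separator, 0x15 field end, with 0x0D as the paragraph mark); the class
name, the method name, and the field name "Phrase" are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class MergeFieldTextSketch {
    // Field delimiter characters in the Word binary text stream.
    static final char FIELD_BEGIN = '\u0013';
    static final char FIELD_SEPARATOR = '\u0014';
    static final char FIELD_END = '\u0015';

    /** Drops field instructions and keeps field results, i.e. the text a reader sees. */
    static String visibleText(String raw) {
        StringBuilder out = new StringBuilder();
        // One entry per open field: true while we are still in its instruction part.
        Deque<Boolean> inInstruction = new ArrayDeque<>();
        for (char c : raw.toCharArray()) {
            if (c == FIELD_BEGIN) {
                inInstruction.push(true);
            } else if (c == FIELD_SEPARATOR && !inInstruction.isEmpty()) {
                // Instruction part is over; the field result follows.
                inInstruction.pop();
                inInstruction.push(false);
            } else if (c == FIELD_END && !inInstruction.isEmpty()) {
                inInstruction.pop();
            } else if (!inInstruction.contains(true)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical merge field whose result contains a paragraph mark (the ENTER).
        String raw = FIELD_BEGIN + " MERGEFIELD Phrase " + FIELD_SEPARATOR
                + "Phrase brought from\rExcel" + FIELD_END + "\r";
        System.out.println(visibleText(raw).replace('\r', '\n'));
    }
}
```

In this model the field result spans two paragraphs, and each word of the
result appears exactly once in the visible text. That is the output I would
expect from the converter as well.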
processDocument->AbstractWordConverter->org.apache.poi.hwpf.converter->poi-scratchpad-3.8-beta4.jar
To rule out that this had already been fixed in a later release, I upgraded
through each version up to the latest, 3.16, but the bug persists.
My code:
FileInputStream finStream = new FileInputStream(docFile.getAbsolutePath());
HWPFDocument doc = new HWPFDocument(finStream);
WordExtractor wordExtract = new WordExtractor(doc);
Document newDocument = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().newDocument();
WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(newDocument);
wordToHtmlConverter.processDocument(doc);
StringWriter stringWriter = new StringWriter();
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
transformer.setOutputProperty(OutputKeys.METHOD, "html");
transformer.transform(new DOMSource(wordToHtmlConverter.getDocument()),
        new StreamResult(stringWriter));
String html = stringWriter.toString();
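One way to narrow this down, assuming the serialization step might be suspected
of causing the duplication, is to run the same Transformer settings over a
hand-built DOM that contains no POI output. The sketch below uses only JDK
classes; the class and helper names (SerializationCheckSketch, toHtml,
sampleDocument, countOccurrences) are hypothetical and exist only for this
illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SerializationCheckSketch {
    /** Serializes a DOM with the same output properties as the code above. */
    static String toHtml(Document document) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        transformer.setOutputProperty(OutputKeys.METHOD, "html");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(writer));
        return writer.toString();
    }

    /** Builds a two-paragraph document that mirrors the expected result. */
    static Document sampleDocument() throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element html = document.createElement("html");
        Element body = document.createElement("body");
        document.appendChild(html);
        html.appendChild(body);
        for (String line : new String[] {"Phrase brought from", "Excel"}) {
            Element p = document.createElement("p");
            p.setTextContent(line);
            body.appendChild(p);
        }
        return document;
    }

    /** Counts non-overlapping occurrences of needle in text. */
    static int countOccurrences(String text, String needle) {
        int count = 0;
        for (int i = text.indexOf(needle); i >= 0;
                i = text.indexOf(needle, i + needle.length())) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String html = toHtml(sampleDocument());
        System.out.println(html);
        System.out.println("Occurrences of \"Excel\": " + countOccurrences(html, "Excel"));
    }
}
```

If the word shows up only once here but twice in the output of the real
conversion, the duplicate is already present in the DOM returned by
wordToHtmlConverter.getDocument(), which would point at processDocument rather
than at the Transformer.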
Thanks.