https://issues.apache.org/bugzilla/show_bug.cgi?id=50972
Summary: XWPFWordExtractor ignores <w:br/> entries
Product: POI
Version: 3.8-dev
Platform: PC
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: XWPF
AssignedTo: [email protected]
ReportedBy: [email protected]
Created an attachment (id=26797)
--> (https://issues.apache.org/bugzilla/attachment.cgi?id=26797)
Test document
Two words separated by a line break character are glued together.
I tried to debug the issue and found the following code in the XWPFRun.toString() method:
    if (o instanceof CTEmpty) {
        // Some inline text elements get returned not as
        // themselves, but as CTEmpty, owing to some odd
        // definitions around line 5642 of the XSDs
        String tagName = o.getDomNode().getNodeName();
        if ("w:tab".equals(tagName)) {
            text.append("\t");
        }
        if ("w:br".equals(tagName)) {
            text.append("\n");
        }
        <...>
    }
The issue is that "o" is an instance of CTBrImpl, not CTEmpty, so the block above
is never entered and the <w:br/> element is ignored.
Attached a test document.
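
To illustrate the problem outside of POI, here is a minimal sketch that decides on
the element name alone instead of on the XMLBeans class, so a <w:br/> produces a
newline no matter which class the schema binding returns for it. It uses only the
JDK DOM parser. The class name RunTextSketch and the method extractText are
hypothetical and are not part of POI, and this is not necessarily how
XWPFRun.toString() should be fixed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in for the run text extraction; not POI code.
class RunTextSketch {
    // WordprocessingML main namespace, normally bound to the "w" prefix.
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Builds the text of a single <w:r> element given as an XML string.
    static String extractText(String runXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(runXml)));

            StringBuilder text = new StringBuilder();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip text nodes and elements from other namespaces.
                if (child.getNodeType() != Node.ELEMENT_NODE
                        || !W_NS.equals(child.getNamespaceURI())) {
                    continue;
                }
                // Dispatch on the element name only, never on the Java type.
                String name = child.getLocalName();
                if ("t".equals(name)) {
                    text.append(child.getTextContent());
                } else if ("tab".equals(name)) {
                    text.append('\t');
                } else if ("br".equals(name)) {
                    text.append('\n');
                }
            }
            return text.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse run XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:r xmlns:w=\"" + W_NS + "\">"
            + "<w:t>Hello</w:t><w:br/><w:t>World</w:t></w:r>";
        // Show the line break visibly as \n on one output line.
        System.out.println(extractText(xml).replace("\n", "\\n"));
    }
}
```

With this approach the two words in the sample run stay separated by a line break
instead of being glued together.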