> On 29 Mar 2016, at 23:00, Jerry <[email protected]> wrote:
>
> I have written an application that generates an .epub document from user
> input.
>
> I am now trying to use PdfBox to add PDF output of the same source text. But
> I have encountered problems when trying to render bold or italic text:
>
> - In the italic font, the characters u and i in the word "quick" are
>   overlapped.
>
> - In the word-pair "brown fox" (where "brown" is in plain font and "fox" is
>   italic) there is no space between the words but there is an extra space
>   between the f and o in "fox".
>
> - In the phrase "dog and ran" (which is bold) the single space between "and"
>   and "ran" is too wide, and there is no space following "ran" and the next
>   word.
>
> And yet, the same string is rendered with correct spacing when output as
> plain text (no font changes).
>
> See the output files at:
>
> https://www.dropbox.com/s/ox4arbrfiv5jqfu/withNoHtmlTags.pdf?dl=0
> https://www.dropbox.com/s/wgj029hm4wre1x5/withItalicsAndBoldFonts.pdf?dl=0
>
> As a newbie to both PDF and PdfBox, I started with a tutorial I found at
> http://www.coderanch.com/t/659953/Wiki/PDFBox. Once I verified that I had
> entered the tutorial correctly by running it and viewing the output, I began
> experimenting by displaying a simple test string that is long enough to
> require word wrapping. When I got that to work, I tried adding bold and
> italic HTML tags to the string (since the end goal is to create PDF from
> .epub source).
>
> Here is my test code:
>
> https://www.dropbox.com/s/k9d22s0xsgg8tz8/TestBed.java?dl=0
>
> In TestBed.java, doTutorial() is the unmodified tutorial.
>
> The method doMyCode() displays the test string by breaking it into individual
> whole words. If I mark words with <i> and <b> tags, they are correctly
> rendered with bold and italic fonts. But this limits font changes to whole
> words only, which rules out a font change in the middle of a string of
> characters. To handle that I need to output individual characters, not words.

Do you really need to handle that? Changing fonts mid-word is generally not a
done thing.

-- John

> The method doMyCode2() displays the test string word by word unless the word
> contains an HTML tag, then text is rendered character by character.
> If the test string contains no tags, it renders correctly.
>
> See the sample file withNoHtmlTags.pdf.
>
> When <i> and <b> tags are encountered, fonts get changed to
> PDType1Font.TIMES_BOLD or PDType1Font.TIMES_ITALIC as required, and the
> string is rendered, but the character spacing is mangled.
>
> See the sample file withItalicsAndBoldFonts.pdf.
>
> Both of these files were generated by the same code---the doMyCode2()
> method---with the only change being the addition or subtraction of <i> and
> <b> tags to the string paraText.
>
> It does not appear to be a font problem, rather a rendering problem. I get
> the same (well, nearly the same) results with both Times and Helvetica---the
> "nearly the same" being the positioning of the u and i characters in the word
> "quick"---still overlapping, but in the Helvetica rendering, the i is in the
> middle of the u while in the Times rendering, the i overlaps the last stroke
> of the u so that it looks like a u with a dot over its tail.
>
> What can I do to fix this?
>
> Thanks.
>
> Jerry
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
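
If mid-word font changes do turn out to be needed, one possible alternative to
rendering character by character is to split the tagged string into runs that
each use a single style, and draw each run in one call. The following is only
a sketch of that splitting step in plain Java. The names StyledRuns, Run and
parse are hypothetical and are not taken from TestBed.java, and it assumes the
only tags present are <i>, </i>, <b> and </b>.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: splits text with <i>/<b> tags into single-style runs. */
class StyledRuns {

    /** A stretch of text that is drawn with one font style. */
    static final class Run {
        final String text;
        final boolean bold;
        final boolean italic;

        Run(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }

        @Override
        public String toString() {
            String style = bold && italic ? "bold-italic"
                    : bold ? "bold" : italic ? "italic" : "plain";
            return style + "[" + text + "]";
        }
    }

    /** Walks the markup once, starting a new run at every recognised tag. */
    static List<Run> parse(String markup) {
        List<Run> runs = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        boolean bold = false;
        boolean italic = false;
        int i = 0;
        while (i < markup.length()) {
            int close = markup.charAt(i) == '<' ? markup.indexOf('>', i) : -1;
            String tag = close > 0
                    ? markup.substring(i + 1, close).toLowerCase(Locale.ROOT)
                    : "";
            boolean known = tag.equals("i") || tag.equals("/i")
                    || tag.equals("b") || tag.equals("/b");
            if (!known) {
                // Ordinary character (or an unrecognised '<'): keep it as text.
                pending.append(markup.charAt(i));
                i++;
                continue;
            }
            // The style changes here, so close the run collected so far.
            if (pending.length() > 0) {
                runs.add(new Run(pending.toString(), bold, italic));
                pending.setLength(0);
            }
            boolean opening = !tag.startsWith("/");
            if (tag.endsWith("i")) {
                italic = opening;
            } else {
                bold = opening;
            }
            i = close + 1;
        }
        if (pending.length() > 0) {
            runs.add(new Run(pending.toString(), bold, italic));
        }
        return runs;
    }

    public static void main(String[] args) {
        String paraText = "The qu<i>ic</i>k brown <i>fox</i> and the <b>dog and ran</b> away.";
        for (Run run : parse(paraText)) {
            System.out.println(run);
        }
    }
}
```

Spaces stay inside the runs, so word spacing would come from each font's own
space glyph rather than from hand-computed offsets. Each run could then be
drawn by selecting the matching font with setFont() and calling showText()
once (drawString() in PDFBox 1.8). Within one text block the text position
should advance after each call, so consecutive runs on a line would not need
manual positioning. Word wrapping would still need run widths, which
PDFont.getStringWidth() reports in 1/1000 text units, to be scaled by the
font size.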

