Apologies for being blunt, but since you're mixing string literals and Unicode escape sequences, I have to ask: are you *sure* the same character encoding is used when editing the .java file and when compiling it? I've run into discrepancies where the file was edited in one encoding (say, UTF-8) while the automated build system compiled it as ISO 8859-1. In that case literal non-Latin characters get mangled, whereas those written as Unicode escape sequences survive, because the escapes themselves consist only of ASCII characters.
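
In case it helps, here is a minimal sketch of how I would check for this. The class and method names (EncodingCheck, misdecode, codePoints) are made up for illustration, and the code only simulates a UTF-8 versus ISO 8859-1 mix-up; it does not inspect your actual build. It is written with escapes only, so the sketch itself compiles the same way under any ASCII-compatible source encoding.

```java
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    /**
     * Simulates the mismatch: the text is saved as UTF-8 bytes,
     * but those bytes are then decoded as ISO 8859-1.
     */
    public static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    /** Lists the code points of a string as "U+XXXX" values separated by spaces. */
    public static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // DEVANAGARI LETTER TA, written as an escape so this file stays pure ASCII.
        String ta = "\u0924";
        System.out.println(codePoints(ta));            // U+0924
        System.out.println(codePoints(misdecode(ta))); // U+00E0 U+00A4 U+00A4
    }
}
```

If you print codePoints(...) for one of your own literals and see Latin-1 values like these instead of values in the Devanagari block (U+0900 to U+097F), the compiler is probably reading the file in the wrong encoding. Passing -encoding UTF-8 to javac, or the equivalent setting in your build tool, should fix that.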

2017-01-19 14:17 GMT+02:00 Claudius Teodorescu <[email protected]>:
> So, I found the private-use Unicode code point for a ligature, and displayed
> it in a PDF document by using the code:
>
> pageContentStream.showText("त्त्व is correctly displayed with glyph substitution as " + "\ue10d");
>
> The result is in the attached file.
>
> So it looks like all that is needed is a string with all the glyph
> substitutions already applied. With this approach, PDFBox is left
> untouched.
>
> Cheers from Heidelberg,
> Claudius
>
> On Tue, Jan 17, 2017 at 8:55 AM, Tilman Hausherr <[email protected]> wrote:
>>
>> On 17.01.2017 at 07:32, Claudius Teodorescu wrote:
>>>
>>> Well, I was just about to congratulate myself for fixing this with PDFBox,
>>> as FOP is returning good output, but with a character that is represented
>>> in half.
>>>
>>> So, I guess I will need a text layout engine. What output of such an engine
>>> would be fit for PDFBox?
>>
>> In PDPageContentStream.showText there is this line:
>>
>>     COSWriter.writeString(font.encode(text), getOutput());
>>
>> So you need to get that sequence... might be tricky as above that line
>> there's the subsetting that also needs the correct codes. This is not a
>> change that will be done within a few hours.
>>
>> Tilman
>>
>>> Thanks,
>>> Claudius
>>>
>>> On Tue, Jan 17, 2017 at 7:18 AM, Tilman Hausherr <[email protected]> wrote:
>>>
>>>> On 15.01.2017 at 20:04, Claudius Teodorescu wrote:
>>>>
>>>>> It is not a big deal, but it works for an AWT component, though it is
>>>>> not related to that:
>>>>>
>>>>>     String s = "कारणत्त्वङ्गवाश्वादीनमपीति चेत् युक्तम्";
>>>>>     Font font2 = new Font("Sanskrit2003", Font.PLAIN, 24);
>>>>>     FontRenderContext frc = new FontRenderContext(new AffineTransform(), true, true);
>>>>>
>>>>>     char[] chars = s.toCharArray();
>>>>>     GlyphVector glyphVector = font2.layoutGlyphVector(frc, chars, 0, chars.length, 0); // createGlyphVector(frc, s);
>>>>>
>>>>>     int length = glyphVector.getNumGlyphs();
>>>>>
>>>>>     for (int i = 0; i < length; i++) {
>>>>>         Shape glyph = glyphVector.getGlyphOutline(i);
>>>>>         System.out.println(glyphVector.getGlyphCode(i));
>>>>>     }
>>>>>
>>>>> Any pointers about where I can hook this in PDFBox?
>>>>
>>>> Problem is we don't use the awt fonts anymore.
>>>>
>>>> Tilman
>>>>
>>>>> Thanks,
>>>>> Claudius
>>>>>
>>>>> On Sun, Jan 15, 2017 at 4:56 PM, Andreas Lehmkuehler <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 15.01.2017 at 15:51, Claudius Teodorescu wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Thanks for the answer, Tilman.
>>>>>>>
>>>>>>> I managed to get the Devanagari text exactly as it should be, by using
>>>>>>> java.awt.Font.layoutGlyphVector().
>>>>>>>
>>>>>>> Is there any chance of writing a GlyphVector to a PDFBox page?
>>>>>>
>>>>>> There was a discussion at [1] about using GlyphVector, but we didn't
>>>>>> make any decision, nor did we implement anything.
>>>>>>
>>>>>> Do you mind sharing some of your code as a possible starting point?
>>>>>>
>>>>>> BR
>>>>>> Andreas
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/PDFBOX-3550
>>>>>>
>>>>>>> Thanks,
>>>>>>> Claudius
>>>>>>>
>>>>>>> On Sat, Jan 14, 2017 at 9:45 AM, Tilman Hausherr <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> This is not supported, sorry. PDFBox just outputs the glyphs for the
>>>>>>>> single characters and does not replace them with ligatures.
>>>>>>>>
>>>>>>>> Tilman
>>>>>>>>
>>>>>>>> On 14.01.2017 at 08:44, Claudius Teodorescu wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am using PDFBox 2.0.4, and I am trying to output a PDF document
>>>>>>>>> with the following Devanagari text: कारणत्त्वङ्गवाश्वादीनमपीति चेत् युक्तम्.
>>>>>>>>>
>>>>>>>>> The code is very simple:
>>>>>>>>>
>>>>>>>>>     @Test
>>>>>>>>>     public void testPdfBox() throws IOException {
>>>>>>>>>         PDDocument document = new PDDocument();
>>>>>>>>>         PDPage page = new PDPage();
>>>>>>>>>         document.addPage(page);
>>>>>>>>>
>>>>>>>>>         PDFont font = PDType0Font.load(document,
>>>>>>>>>                 new File("/home/claudius/workspaces/repositories/backup/fonts/Sanskrit2003.ttf"));
>>>>>>>>>
>>>>>>>>>         PDPageContentStream contentStream = new PDPageContentStream(document, page);
>>>>>>>>>
>>>>>>>>>         contentStream.beginText();
>>>>>>>>>         contentStream.setFont(font, 12);
>>>>>>>>>         contentStream.moveTextPositionByAmount(100, 700);
>>>>>>>>>         contentStream.showText("कारणत्त्वङ्गवाश्वादीनमपीति चेत् युक्तम्");
>>>>>>>>>         contentStream.endText();
>>>>>>>>>
>>>>>>>>>         // Make sure that the content stream is closed:
>>>>>>>>>         contentStream.close();
>>>>>>>>>
>>>>>>>>>         // Save the results and ensure that the document is properly closed:
>>>>>>>>>         document.save("target/" + name.getMethodName() + ".pdf");
>>>>>>>>>         document.close();
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>> The output PDF file (attached) does not render the string correctly,
>>>>>>>>> as it appears above. Namely, the ligatures are not displayed, as if
>>>>>>>>> they did not exist. On the other hand, if I copy the text from the
>>>>>>>>> PDF file and paste it into Eclipse, it shows perfectly.
>>>>>>>>>
>>>>>>>>> I checked the PDF output with Evince, Firefox, and Adobe Reader 9
>>>>>>>>> on Ubuntu.
>>>>>>>>>
>>>>>>>>> Any idea on how to fix this display issue?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Claudius
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> http://kuberam.ro
>>>>>>>>>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> To unsubscribe, e-mail: [email protected]
>>>>>>>>> For additional commands, e-mail: [email protected]

