On Sat, 02-02-2008 at 21:27 -0800, Michael Vrable wrote:
> The root cause of https://bugs.freedesktop.org/show_bug.cgi?id=12808 is
> that the code for rendering form fields in poppler didn't properly deal
> with input strings provided in UTF-16: the string was treated as an
> 8-bit string, and the byte-order-mark at the front was included in the
> length calculation.
>
> I started off trying to create a simple fix for this problem, but
> eventually ended up significantly rewriting the code for displaying form
> fields to fix other problems that I found, eventually working to add
> near full support for Unicode inputs.
>
> Since these changes are large, I don't expect this patch to go in right
> away. But please, provide feedback. My work is based on git commit
> 6f11ef660540.
>
> There are two patches. The first, character-encoding-fixes.patch, is a
> couple of fairly trivial fixes that I came across while working on the
> larger patch. It can go in at any time if it looks good.
>
> The second patch, unicode-forms-support.patch, is the main part of the
> work and the patch I'd like comments on. Most new functionality is in
> the new Annot::layoutText function. It performs a few steps:
>   - Converts input in PDFDocEncoding or UTF-16 to the font's encoding
>   - Computes the width of the text on the page
>   - Optionally breaks the text at the specified width, for multi-line
>     form fields
> All of this ended up in the same function since finding break-points for
> lines is easiest to do on the input encoding, where spaces and newlines
> are easier to recognize than in whatever encoding the font uses, but the
> width of text is easiest to compute when re-encoding the text string.
>
> The main missing element for full Unicode handling is the writing out of
> text for CID-keyed fonts. There is currently support for taking
> Unicode characters as input and finding the appropriate character code
> in the font to show it. However, there isn't code for writing out the
> correct sequence of bytes to show that character (doing so should be
> trivial for an identity CMap, but isn't added quite yet).
>
> Also missing: support for Unicode text outside the BMP, using surrogate
> pairs.
>
> I've done some limited testing with these patches (in evince), and they
> definitely work better for me than before. However, I don't currently
> have PDFs for testing many features, so pointers to any good test forms
> are appreciated!
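
[Editorial sketch, not part of the patch.] The steps described for
Annot::layoutText can be illustrated roughly as follows. This is written in
Python for brevity rather than poppler's C++, and every name in it
(decode_pdf_text, layout_text, char_width) is hypothetical. It assumes
Latin-1 as a stand-in for PDFDocEncoding (the two differ in some code
points) and a caller-supplied per-character width function in place of real
font metrics. It breaks lines on the decoded input text, where spaces and
newlines are easy to recognize, and it does not attempt the conversion to
the font's own encoding.

```python
def decode_pdf_text(raw: bytes) -> str:
    """Decode a PDF text string into a Python str."""
    if raw.startswith(b"\xfe\xff"):
        # UTF-16BE input: drop the 2-byte byte-order mark first, so it is
        # not counted as part of the text.
        return raw[2:].decode("utf-16-be")
    # Assumption: Latin-1 approximates PDFDocEncoding for this sketch.
    return raw.decode("latin-1")


def layout_text(text, max_width, char_width):
    """Greedily break text into lines no wider than max_width.

    char_width is a hypothetical callable mapping one character to its
    width. Returns a list of (line, width) pairs. A single word wider than
    max_width is placed on its own line and allowed to overflow.
    """
    def width(s):
        return sum(char_width(c) for c in s)

    lines = []
    for paragraph in text.split("\n"):  # explicit newlines always break
        current = ""
        for word in paragraph.split(" "):
            candidate = word if not current else current + " " + word
            if current and width(candidate) > max_width:
                lines.append(current)
                current = word
            else:
                current = candidate
        lines.append(current)
    return [(line, width(line)) for line in lines]


if __name__ == "__main__":
    field = decode_pdf_text(b"\xfe\xff\x00A\x00\xe9")
    print(len(field))  # number of characters, BOM excluded
    print(layout_text("aa bb cc", 5, lambda c: 1))
```

Because the loop consumes exactly one word per iteration, it always
terminates, even when a word does not fit within the requested width.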

Hi Michael,

thank you very much for the patches. I have tested them with several
documents and they work pretty well. The only thing that is still broken
is multiline form fields. It was indeed already broken (see bug
http://bugzilla.gnome.org/show_bug.cgi?id=499939), but in a different way.
Now it seems to enter an infinite loop after editing a multiline form
field. You can use this file to reproduce the problem:

http://www.okular.org/stuff/forms-scribus.pdf

> Features tested:
>   - Accented characters; typographic characters such as bullets, quotes
>   - Left, center, right alignment of single-line fields
>   - Checkboxes work as before
>   - Single-line comb fields still work
> Not tested:
>   - Multi-line fields (my test form doesn't have them)
>   - Form fields with composite fonts (no test forms; code still needs a
>     tiny bit of work)
>
> --Michael Vrable
> _______________________________________________
> poppler mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/poppler

-- 
Carlos Garcia Campos
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://carlosgc.linups.org

PGP key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x523E6462

signature.asc
Description: This part of the message is digitally signed
