[
https://issues.apache.org/jira/browse/PDFBOX-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289572#comment-14289572
]
John Hewson edited comment on PDFBOX-2618 at 1/23/15 5:39 PM:
--------------------------------------------------------------
Perhaps I didn't make myself clear: this issue is feature creep and should
either be closed or addressed by adding an example to PDFBox. The complexity of
laying out even a single line of text using Unicode and an OpenType font cannot
be overstated - *entire* libraries are devoted to this task. We would need to
use at least ICU4J and some sort of HarfBuzz equivalent for Java (which doesn't
even exist - that's how hard it is to build). Even FOP doesn't get this right.
I don't want to see PDFBox turn from being a high-quality low-level PDF library
into a low-quality typesetting library. But this is exactly what will happen if
this issue is allowed to proceed. It's _guaranteed_ that there will be no end
to the JIRA issues opened once we add such a feature - when users can typeset
Unicode text, they come to depend on it and expect it not to fail when faced
with something complex or non-Western.
We need to rename this issue to better reflect what it's proposing. Here are
the choices:
- A) *"Rebuild ICU4J and HarfBuzz ourselves"*
- B) *"Build half-baked fundamentally broken Western typesetting into PDFBox".*
C) Alternatively, we could write an example which uses either ICU4J or the
JDK's font handling (yes, this _will_ work - and it is even slated to be replaced
with HarfBuzz in OpenJDK in a future release). It would be great to have such
an example!
A, B, or C?
> Create paragraphs with PDFBox
> -----------------------------
>
> Key: PDFBOX-2618
> URL: https://issues.apache.org/jira/browse/PDFBOX-2618
> Project: PDFBox
> Issue Type: Improvement
> Components: Writing
> Affects Versions: 2.0.0
> Reporter: Tilman Hausherr
>
> [~mkl] wrote this morning on stackoverflow on the topic about creating tables
> with PDFBox:
> {quote}I'm afraid all those samples IMO merely are proofs of concept, probably
> of use in limited use cases but by far not for generic use. PDFBox has its
> strengths, e.g. a quite versatile content extraction framework and a content
> rendering capability, but the absence of a proper layouting API is a serious
> weakness.{quote}
> To which I answered:
> {quote}I know... I just don't want to create another iText. We're not the
> Samwer brothers.{quote}
> But he's right. We could of course look at what iText offers and implement
> that on our own, that wouldn't even be illegal, but it wouldn't be nice. I've
> never looked at or used iText, except once when answering this:
> http://stackoverflow.com/a/26820598/535646
> IMO what we need, to start, is a method to write a paragraph to a PDF. Such a
> method would have these parameters:
> - text
> - rectangle (or width and height from current position)
> Such a method would then output the text and break the lines at the end of
> the rectangle, and throw an exception if the space isn't enough.
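
For illustration only, the paragraph method described above (text plus a rectangle, break lines at the rectangle's edge, throw if the space isn't enough) could be sketched roughly as follows. This is not PDFBox API: the class name `ParagraphSketch`, the method `layout`, and the `measure` callback are all hypothetical. Break opportunities come from the JDK's `java.text.BreakIterator`; the width callback is an assumption standing in for a real font metric (with PDFBox it might wrap `PDFont.getStringWidth`, scaled by font size / 1000). The sketch does simple greedy breaking only and ignores shaping, bidirectional text, hyphenation, and forced line breaks.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Hypothetical helper (not part of PDFBox): greedy line breaking inside a box. */
final class ParagraphSketch {

    private ParagraphSketch() {}

    /**
     * Splits text into lines no wider than boxWidth, breaking only at positions
     * reported by the JDK's line BreakIterator.
     *
     * @param measure assumed callback returning a string's width in the same
     *                units as boxWidth (stands in for a real font metric)
     * @throws IllegalArgumentException if a single unbreakable segment is wider
     *         than the box, or if the lines need more than boxHeight
     */
    static List<String> layout(String text, double boxWidth, double boxHeight,
                               double lineHeight, ToDoubleFunction<String> measure) {
        BreakIterator breaks = BreakIterator.getLineInstance();
        breaks.setText(text);
        breaks.first();

        List<String> lines = new ArrayList<>();
        int lineStart = 0; // index where the current line begins
        int lineEnd = 0;   // last break position known to fit on the current line

        for (int pos = breaks.next(); pos != BreakIterator.DONE; pos = breaks.next()) {
            if (width(text, lineStart, pos, measure) > boxWidth) {
                // Too wide: close the current line at the last position that fit.
                if (lineEnd > lineStart) {
                    lines.add(text.substring(lineStart, lineEnd).trim());
                    lineStart = lineEnd;
                }
                // The segment alone on a fresh line must fit as well.
                if (width(text, lineStart, pos, measure) > boxWidth) {
                    throw new IllegalArgumentException(
                        "Segment wider than box: " + text.substring(lineStart, pos).trim());
                }
            }
            lineEnd = pos;
        }
        if (lineEnd > lineStart) {
            lines.add(text.substring(lineStart, lineEnd).trim());
        }

        // "Throw an exception if the space isn't enough."
        if (lines.size() * lineHeight > boxHeight) {
            throw new IllegalArgumentException(
                "Text needs " + lines.size() + " lines, box is too short");
        }
        return lines;
    }

    /** Width of the trimmed substring, so trailing spaces do not count. */
    private static double width(String text, int from, int to,
                                ToDoubleFunction<String> measure) {
        return measure.applyAsDouble(text.substring(from, to).trim());
    }
}
```

With a toy metric of one unit per character, `ParagraphSketch.layout("The quick brown fox jumps", 10, 36, 12, s -> s.length())` yields the three lines "The quick", "brown fox", and "jumps"; shrinking the box height to 24 makes the same call throw.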