On Thu, Apr 25, 2013 at 1:08 PM, Vincent Hennebert <vhenneb...@gmail.com> wrote:

> On 25/04/13 17:48, Glenn Adams wrote:
> > On Thu, Apr 25, 2013 at 2:31 AM, Vincent Hennebert <vhenneb...@gmail.com> wrote:
> >
> >>
> >> It doesn’t shock me to store text as text in the IF and to re-do the
> >> glyph mapping when rendering it to the final output format. This is
> >> actually how it is done ATM.
> >>
> >
> > I think this is a bad idea for the reasons that Alexios mentioned, and that
> > I previously mentioned about recreating sufficient layout context to repeat
> > the process reliably.
>
> What exactly do you mean by ‘sufficient layout context’? What would be
> missing from the IF that would prevent re-doing the glyph mapping?
>

Off hand, we would need:

   - language
   - script
   - font features to be applied (with parameters)
   - letter-spacing settings

There are probably others. I just don't see any reason to use this approach.
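
To make the list concrete, here is a rough sketch of what would have to
accompany each text run in the IF before the mapping could be repeated.
GlyphMappingContext and its field names are hypothetical (this is not an
existing FOP class), the millipoint unit is an assumption, and the list is
likely incomplete:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the per-run context; not an existing FOP class.
final class GlyphMappingContext {
    final String language;                   // e.g. "ar"
    final String script;                     // e.g. "arab"
    final Map<String, Integer> fontFeatures; // feature tag -> parameter value
    final int letterSpacing;                 // assumed unit: millipoints

    GlyphMappingContext(String language, String script,
                        Map<String, Integer> fontFeatures, int letterSpacing) {
        this.language = language;
        this.script = script;
        // Defensive copy so the context cannot change after it is recorded.
        this.fontFeatures =
            Collections.unmodifiableMap(new LinkedHashMap<>(fontFeatures));
        this.letterSpacing = letterSpacing;
    }

    // Two runs would map to the same glyphs only if every field matches.
    boolean sameAs(GlyphMappingContext other) {
        return language.equals(other.language)
            && script.equals(other.script)
            && fontFeatures.equals(other.fontFeatures)
            && letterSpacing == other.letterSpacing;
    }
}
```

Every one of these fields would have to be serialized into the IF and read
back for the second mapping pass to give the same result as the first.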


>
>
> >> Sure it may become more costly when you start using complex scripts,
> >> but that would have to be confirmed with some profiling first and foremost.
> >> We might be surprised.
> >>
> >> We should keep in mind that it’s a perfectly reasonable use case to add
> >> text to the IF as part of a post-processing step. That text will have to
> >> go through the glyph mapping code anyway.
> >>
> >> Also, to have copy-paste work properly from PDF the original text must
> >> be present in the IF.
> >>
> >
> > Agreed, but this is a different requirement, and it doesn't entail
> > reconstructing part of the layout context and repeating the character to
> > glyph mapping and positioning process.
>
> You’ll have to do that for text added at post-process time anyway?
>

I don't understand what this means.


>
>
> >> Storing information about the private use area in the IF is exposing
> >> internal implementation details of FOP.
> >
> >
> > I disagree. In fact, it is working around a bug that exists in certain
> > fonts which forces FOP to make use of synthesized PUA mappings. The bug is
> > that the font designer did not fully populate the original CMAP, i.e.,
> > include a mapping for every accessible glyph.
>
> I still don’t get it I’m afraid. Where in the TrueType spec is it stated
> that every glyph should have an entry in the cmap?


It doesn't. But if someone uses a font, wants to present a glyph that has
no mapping, and must use character codes, then it won't work.
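
As an illustration of the workaround, here is a minimal sketch of synthesizing
PUA code points for glyphs the cmap does not reach. The class and method names
are made up for this message, the cmap is modeled as a plain map from code
point to glyph ID, and this is not the actual logic in
MultiByteFont.createPrivateUseMappings():

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not FOP code.
final class PuaSynthesisSketch {
    static final int PUA_START = 0xE000; // first BMP Private Use Area code point
    static final int PUA_END = 0xF8FF;   // last BMP Private Use Area code point

    // Returns synthesized (code point -> glyph ID) entries for every glyph
    // that no existing cmap entry maps to. Glyph 0 (.notdef) is skipped.
    static Map<Integer, Integer> synthesize(Map<Integer, Integer> cmap,
                                            int glyphCount) {
        boolean[] reachable = new boolean[glyphCount];
        for (int gid : cmap.values()) {
            if (gid >= 0 && gid < glyphCount) {
                reachable[gid] = true;
            }
        }
        Map<Integer, Integer> pua = new TreeMap<>();
        int next = PUA_START;
        for (int gid = 1; gid < glyphCount; gid++) {
            if (reachable[gid]) {
                continue;
            }
            // Do not reuse a PUA code point the font already assigns.
            while (next <= PUA_END && cmap.containsKey(next)) {
                next++;
            }
            if (next > PUA_END) {
                break; // PUA exhausted; remaining glyphs stay unmapped
            }
            pua.put(next++, gid);
        }
        return pua;
    }
}
```

With a cmap that covers glyphs 1 and 2 of a five-glyph font, this assigns
0xE000 to glyph 3 and 0xE001 to glyph 4. Those assignments exist only in the
running process, which is why they are lost when the text goes out to IF.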


> Why can’t FOP just
> use the glyph ID? Surely that information is enough?
>

Well, for one thing, the IF interface for renderText uses a character
string, not a glyph index string, and the IF XML format uses Unicode code
points.


>
>
> >> When going the direct FO to PDF
> >> route, mapping glyphs to character codes to re-map them again into
> >> glyphs when creating the PDF is sub-optimal. We might as well work with
> >> the glyph indices all the way through.
> >>
> >
> > This is possible, but wouldn't it require two separate paths through the
> > IF layer, and would it not work for non-PDF output?
>
> I don’t think so. The original text should be passed through anyway to
> create the ToUnicode cmap.


Why?


> So PDF can use the glyph mapping to generate
> the text operators and the original text for the ToUnicode cmap. The IF
> renderer just streams out the original text. And the other renderers
> just deal with the glyph mapping.
>

Since the technique I suggested works without it, repeating the character to
glyph mapping, positioning, and layout process isn't necessary. I have agreed,
however, that embedding the original Unicode text to support copy and find
operations would be useful, and there is already an open bug for that [1].

[1] https://issues.apache.org/jira/browse/FOP-2204
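
For reference, here is a sketch of how the <pua/> child elements I proposed
earlier (quoted further down) might be read back and merged into a font's
cmap. The class and method names are hypothetical, the cmap is modeled as a
plain map, and letting the IF-supplied mappings win on conflict is an
assumption rather than a settled design:

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch; not FOP code.
final class PuaElementSketch {

    // Reads <pua code="0xE000" gid="139"/> children of a <font> element.
    static Map<Integer, Integer> readPua(String fontXml) {
        try {
            Element font = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(fontXml)))
                .getDocumentElement();
            NodeList nodes = font.getElementsByTagName("pua");
            Map<Integer, Integer> result = new TreeMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element pua = (Element) nodes.item(i);
                // Integer.decode accepts the 0x-prefixed form used above.
                int code = Integer.decode(pua.getAttribute("code"));
                int gid = Integer.parseInt(pua.getAttribute("gid"));
                result.put(code, gid);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("unreadable <font> element", e);
        }
    }

    // Assumption: mappings from the IF file replace any PUA mappings that
    // were already synthesized when the font was loaded.
    static void merge(Map<Integer, Integer> cmap, Map<Integer, Integer> pua) {
        cmap.putAll(pua);
    }
}
```

The point is that the renderer only has to restore the code point to glyph ID
table; it does not have to repeat any part of layout.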


>
>
> Vincent
>
>
> > I suspect this falls under
> > the category of "premature optimization", on which Knuth says "Premature
> > optimization is the root of all evil (or at least most of it) in
> > programming."
> >
> >
> >>
> >>
> >> Vincent
> >>
> >>
> >>> On 25 Apr 2013, at 01:52, Glenn Adams <gl...@skynav.com> wrote:
> >>>
> >>>> I see no option but to modify IF. We modified IF for 1.1 in the first
> >>>> place. We have recently made quite a number of backward incompatible
> >>>> changes to the FOP public APIs. I expect the next release will need to
> >>>> bump the major version to 2 for FOP due to these changes, so there is
> >>>> little risk in making a change in IF. If there are other, useful changes
> >>>> to IF that have been postponed, then perhaps they should be reconsidered
> >>>> now as well.
> >>>>
> >>>>
> >>>> On Wed, Apr 24, 2013 at 3:26 PM, Luis Bernardo <lmpmberna...@gmail.com> wrote:
> >>>>
> >>>> These are good suggestions. I am fully aware of the shortcomings that
> >>>> you pointed out, but the only other option seemed to be to codify the
> >>>> mappings in IF, similar to your first suggestion. However, that would
> >>>> mean changing IF, which is not something we are keen to do since that
> >>>> impacts applications that rely on the current format.
> >>>>
> >>>> Are you saying that with your second approach there is no need to
> >>>> change IF?
> >>>>
> >>>>
> >>>> On 4/24/13 7:38 PM, Glenn Adams wrote:
> >>>>> Sure. One way to do this would be to add child elements to the
> >>>>> <font/> element in IF output as follows:
> >>>>>
> >>>>> <font family="Lateef" style="normal" ...>
> >>>>>   <pua code="0xE000" gid="139"/>
> >>>>>   <pua code="0xE001" gid="481"/>
> >>>>>   <pua code="0xE002" gid="219"/>
> >>>>> </font>
> >>>>>
> >>>>> where these PUA mappings are collected by iterating over the
> >>>>> characters of TextAreas governed by the <font/> element. These
> >>>>> characters might be iterated upon invoking TextArea.add{Word,Space},
> >>>>> and collecting this info in text areas.
> >>>>>
> >>>>> Alternatively, MultiByteFont.getUsedGlyphs() could be used to (1)
> >>>>> determine which glyph codes were referenced by the document, (2) given
> >>>>> these used codes, iterate over the CMAP mappings to find which PUA
> >>>>> codes were generated for those glyph codes, then (3) output the <pua/>
> >>>>> elements (above) as required.
> >>>>>
> >>>>> Finally, when reading an IF file, these <pua/> elements would be used
> >>>>> to augment the font's CMAP (keeping in mind that when reading the font,
> >>>>> MultiByteFont.createPrivateUseMappings() may have already been called,
> >>>>> and thus the mappings in <pua/> elements may need to be replaced or
> >>>>> merged).
> >>>>>
> >>>>> I can imagine various other optimizations on the above theme to make
> >>>>> this readily workable.
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Wed, Apr 24, 2013 at 3:18 AM, Chris Bowditch <bowditch_ch...@hotmail.com> wrote:
> >>>>> Hi Glenn,
> >>>>>
> >>>>> Can you suggest an alternative approach please?
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Chris
> >>>>>
> >>>>>
> >>>>> On 24/04/2013 02:41, Glenn Adams wrote:
> >>>>> I don't like this. It negates any additional processing that may have
> >>>>> occurred, such as letter spacing. It requires the IF to repeat part of
> >>>>> the layout process. Bad idea.
> >>>>>
> >>>>>
> >>>>> On Tue, Apr 23, 2013 at 3:11 PM, Luis Bernardo <lmpmberna...@gmail.com> wrote:
> >>>>>
> >>>>>
> >>>>>     With the approach implemented by Simon what gets written to the IF
> >>>>>     file is the original sequence, not the mapped sequence. Then when
> >>>>>     generating PDF from IF the same code that would generate the
> >>>>>     synthesized mappings when generating PDF straight from FO is
> >>>>>     called to recreate the mappings. So I don't think we can say there
> >>>>>     is information about the mappings in the text nodes.
> >>>>>
> >>>>>
> >>>>>     On 4/23/13 5:50 AM, Glenn Adams wrote:
> >>>>>     Ah, I reread your earlier (private) message. I see the problem
> >>>>>     has to do with the use of synthesized PUA mappings. Here, the
> >>>>>     problem really is that the font should always have a CMAP entry
> >>>>>     that maps to every glyph that can be produced by the GSUB
> >>>>>     process. However, not all fonts do this, so in the case in point,
> >>>>>     we have to synthesize some mapping, from which we have to turn to
> >>>>>     PUA assignments. This works when we generate PDF since we
> >>>>>     generate a subset font that contains the synthesized mappings.
> >>>>>     However, I can see that if this is going to IF instead of PDF/PS,
> >>>>>     then we need to find a way to recreate those synthesized mappings.
> >>>>>
> >>>>>     I think this information is really font-specific, and should not
> >>>>>     be tied to specific text nodes though. So if Simon's fix uses
> >>>>>     text nodes, then that is probably not the best approach.
> >>>>>
> >>>>>
> >>>>>     On Mon, Apr 22, 2013 at 10:45 PM, Glenn Adams <gl...@skynav.com> wrote:
> >>>>>
> >>>>>         I'm presently at W3C WG meetings this week, but I'll try to
> >>>>>         get it on my schedule. I'm not sure what the IF->PS/PDF problem
> >>>>>         is, since the IF->PDF path is clearly working from my tests.
> >>>>>
> >>>>>
> >>>>>         On Mon, Apr 22, 2013 at 4:27 PM, Luis Bernardo <lmpmberna...@gmail.com> wrote:
> >>>>>
> >>>>>
> >>>>>             Glenn,
> >>>>>
> >>>>>             Can you give your opinion about the approach used by
> >>>>>             Simon? As I mentioned before (in a private message), the
> >>>>>             IF -> PS/PDF route does not work in your original CS
> >>>>>             patch (for the languages that CS targets) due to the
> >>>>>             mapped sequences. Simon's approach works but requires
> >>>>>             keeping the original sequences alongside the mapped ones.
> >>>>>             I think it is a good approach but I would like to know if
> >>>>>             you have a better suggestion before we apply the patch.
> >>>>>
> >>>>>             Thanks,
> >>>>>             Luis
> >>>>>
> >>>>>
> >>>>>             On 4/22/13 3:23 PM, Chris Bowditch (JIRA) wrote:
> >>>>>
> >>>>>                 [ https://issues.apache.org/jira/browse/FOP-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> >>>>>
> >>>>>                 Chris Bowditch reassigned FOP-2210:
> >>>>>                 -----------------------------------
> >>>>>
> >>>>>                      Assignee: Chris Bowditch
> >>>>>
> >>>>>                     [PATCH] Complex script IF to output missing glyphs
> >>>>>                     --------------------------------------------------
> >>>>>
> >>>>>                                      Key: FOP-2210
> >>>>>                                      URL:
> >>>>>                     https://issues.apache.org/jira/browse/FOP-2210
> >>>>>                                  Project: Fop
> >>>>>                               Issue Type: Bug
> >>>>>                                 Reporter: simon steiner
> >>>>>                                 Assignee: Chris Bowditch
> >>>>>                              Attachments: csspeedtrunk.patch,
> >>>>>                     fop.xconf, test.fo
> >>>>>
> >>>>>
> >>>>>                     fop test.fo -c fop.xconf -if application/pdf expected.if.xml
> >>>>>                     fop -c fop.xconf -ifin expected.if.xml out.pdf
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
>
