[jira] [Resolved] (FOP-3139) Add support for font-selection-strategy=character-by-character
[ https://issues.apache.org/jira/browse/FOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Steiner resolved FOP-3139.
Fix Version/s: main
Resolution: Fixed

https://github.com/apache/xmlgraphics-fop/commit/b16022ece329197f72f47943085d45b56e26806e

> Add support for font-selection-strategy=character-by-character
> --
>
> Key: FOP-3139
> URL: https://issues.apache.org/jira/browse/FOP-3139
> Project: FOP
> Issue Type: Bug
> Reporter: Simon Steiner
> Assignee: Simon Steiner
> Priority: Major
> Fix For: main
>
> Attachments: test2.fo
>
> fop test.fo out.pdf
> All glyphs should map to a font

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FOP-3139) Add support for font-selection-strategy=character-by-character
[ https://issues.apache.org/jira/browse/FOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Steiner updated FOP-3139:
---
Description:
fop test.fo out.pdf
All glyphs should map to a font

> Add support for font-selection-strategy=character-by-character
> --
>
> Key: FOP-3139
> URL: https://issues.apache.org/jira/browse/FOP-3139
> Project: FOP
> Issue Type: Bug
> Reporter: Simon Steiner
> Assignee: Simon Steiner
> Priority: Major
>
> Attachments: test2.fo
>
> fop test.fo out.pdf
> All glyphs should map to a font

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FOP-3139) Add support for font-selection-strategy=character-by-character
[ https://issues.apache.org/jira/browse/FOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Steiner updated FOP-3139:
---
Attachment: test2.fo

> Add support for font-selection-strategy=character-by-character
> --
>
> Key: FOP-3139
> URL: https://issues.apache.org/jira/browse/FOP-3139
> Project: FOP
> Issue Type: Bug
> Reporter: Simon Steiner
> Assignee: Simon Steiner
> Priority: Major
>
> Attachments: test2.fo

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FOP-3139) Add support for font-selection-strategy=character-by-character
Simon Steiner created FOP-3139:
--
Summary: Add support for font-selection-strategy=character-by-character
Key: FOP-3139
URL: https://issues.apache.org/jira/browse/FOP-3139
Project: FOP
Issue Type: Bug
Reporter: Simon Steiner
Assignee: Simon Steiner

--
This message was sent by Atlassian Jira (v8.20.10#820010)
Re: font selection by character
Great, thanks for looking into it! AFAIU, your concept is right. The spec refers to CSS on how to handle the font selection (see [1]). But I also think that's the minimal algorithm. If you have two Symbol characters and a space between, it's probably slightly better-looking to use Symbol's space character, i.e. no font-family change for only white-space. But I'd stick to what's easier to implement, first.

CSS also defines a font-matching concept which, AFAIU, kicks in when mere font-selection doesn't help anymore. But IMO, that's another building site which shouldn't be too difficult to add once we have the basic infrastructure for char-by-char font selection.

[1] http://www.w3.org/TR/REC-CSS2/fonts.html#propdef-font-family
[2] http://www.w3.org/TR/REC-CSS2/fonts.html#algorithm

Jeremias Maerki

On 19.10.2007 13:50:55 Manuel Mall wrote:
> Because the issue of font selection by character was raised again on fop-user I have started to look at it as the appropriate font data structures are now in place. I am just looking for a confirmation here of the selection algorithm.
>
> My understanding is that for each character, independent of its context, the font list is searched for the first font with a matching glyph. This means in particular for characters like space and hyphen (as used for hyphenation), they will always be taken from the first font containing them even if their 'neighbouring' characters are from a different font?
>
> Manuel
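The white-space refinement mentioned in this message can be sketched roughly as follows. This is only an illustration, not FOP code: the class name `WhitespaceAwareSelector` is hypothetical, and fonts are modelled as a map from family name to the characters they cover, in priority order. White-space stays in the font of the preceding character when that font has a glyph for it; every other character takes the first font in the list that covers it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; fonts are modelled as "family name -> covered characters". */
final class WhitespaceAwareSelector {

    /**
     * Returns the chosen family name per character. The map's iteration order
     * is the priority order of the font list (assumption of this sketch).
     */
    static String[] select(String text, Map<String, String> fonts) {
        String[] chosen = new String[text.length()];
        String previous = null;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c) && previous != null
                    && fonts.get(previous).indexOf(c) >= 0) {
                // Keep white-space in the neighbouring font: no family change.
                chosen[i] = previous;
            } else {
                chosen[i] = firstMatch(c, fonts);
            }
            previous = chosen[i];
        }
        return chosen;
    }

    /** First family in priority order that covers c, or null if none does. */
    private static String firstMatch(char c, Map<String, String> fonts) {
        for (Map.Entry<String, String> e : fonts.entrySet()) {
            if (e.getValue().indexOf(c) >= 0) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> fonts = new LinkedHashMap<>();
        fonts.put("Latin", "ab -");
        fonts.put("Symbol", "\u03b1\u03b2 ");
        // Two Symbol characters with a space between them.
        System.out.println(String.join(",", select("\u03b1 \u03b2", fonts)));
    }
}
```

The sketch only looks at the preceding character; a fuller version might also consider the following one, which the message leaves open.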
font selection by character
Because the issue of font selection by character was raised again on fop-user I have started to look at it as the appropriate font data structures are now in place. I am just looking for a confirmation here of the selection algorithm. My understanding is that for each character, independent of its context, the font list is searched for the first font with a matching glyph. This means in particular for characters like space and hyphen (as used for hyphenation), they will always be taken from the first font containing them even if their 'neighbouring' characters are from a different font? Manuel
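The rule as stated in this message can be sketched in a few lines of Java. This is an illustration under stated assumptions, not FOP code: `GlyphSource` and `CharByCharSelector` are hypothetical names, a font is modelled as a name plus the set of characters it covers, and falling back to the first font when nothing matches is a choice made only for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a font: a name plus the characters it has glyphs for. */
final class GlyphSource {
    final String name;
    private final String coverage;

    GlyphSource(String name, String coverage) {
        this.name = name;
        this.coverage = coverage;
    }

    boolean hasChar(char c) {
        return coverage.indexOf(c) >= 0;
    }
}

final class CharByCharSelector {

    /**
     * For each character, independent of its neighbours, picks the first font
     * in the priority list that has a glyph for it. If no font matches, the
     * first font in the list is used (assumption of this sketch).
     */
    static List<GlyphSource> select(String text, List<GlyphSource> fonts) {
        List<GlyphSource> result = new ArrayList<>(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            GlyphSource chosen = fonts.get(0);
            for (GlyphSource font : fonts) {
                if (font.hasChar(c)) {
                    chosen = font;
                    break;
                }
            }
            result.add(chosen);
        }
        return result;
    }

    public static void main(String[] args) {
        List<GlyphSource> fonts = Arrays.asList(
                new GlyphSource("Latin", "ab -"),
                new GlyphSource("Symbol", "\u03b1\u03b2 "));
        for (GlyphSource font : select("\u03b1 \u03b2", fonts)) {
            System.out.println(font.name);
        }
    }
}
```

With the font list ordered Latin, Symbol, the space between the two Greek letters is taken from Latin even though both neighbours come from Symbol, which is exactly the consequence the message asks about.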
Re: font selection by character
On Jul 25, 2007, at 19:48, Manuel Mall wrote:

Manuel,

Just a little heads-up: I've already made the discussed change to CommonFont locally, but since I had already made a few other alterations to this class and this particular change is only minimal, they will be in the same commit (which should be in roughly 24 hours if no objections arise to my cache proposal).

Cheers

Andreas

> On Thursday 26 July 2007 00:23, Andreas L Delmelle wrote:
> <snip />
>> IOW, the code in the LM could come to look like:
>>
>>   //init()
>>   fontkeys = fobj.getCommonFont().getFontState(fobj.getFOEventHandler().getFontInfo());
>>   //instantiate the fonts, one ...
>>   initfont = fobj.getFOEventHandler().getFontInfo().getFontInstance(fontkeys[0], this);
>>   ...
>>   //other_method()
>>   //... by one
>>   fallbackfont = fobj.getFOEventHandler().getFontInfo().getFontInstance(fontkeys[x], this);
>>   //or if desired, create them all at once
>>   //preferably not used for large sets of unloaded fonts (?)
>>   allfonts = fobj.getFOEventHandler().getFontInfo().getFontInstances(fontkeys, this);
>>
>> WDYT?
>
> Yes, that will work I think.
>
> Thanks
>
> Manuel
>
>> Thanks for the feedback.
>>
>> Cheers
>>
>> Andreas
Re: font selection by character
On Jul 25, 2007, at 02:25, Manuel Mall wrote:

> On Wednesday 25 July 2007 05:17, Andreas L Delmelle wrote:
>> Anyone have any objections, for instance, to change the default behaviour of FontInfo.fontLookup(String[], ...) to something like the below code? Should we add a getFontInstances() method too, or rather, since the below method is only used by CommonFont, maybe merge it into such a getFontInstances() method, that would return Font[]?
>>
>> To Manuel in particular: what would be most convenient from your point of view? Is it OK if each LM gets a set of Fonts (or mere triplet-keys), instead of only one, upon the call to getFontState()?
>
> LMs tend to call these methods only once typically in their initialize method. A quick search shows that getCommonFont() is called only 8 times in the LMs and in each case the call is the same:
>
>   font = fobj.getCommonFont().getFontState(fobj.getFOEventHandler().getFontInfo(), this);
>
> In summary LMs are not really interested in the CommonFont object as such. They only use it to get to the Font object (or now we want a list of Font objects) applicable to them. But the LMs would like a Font object not a Triplet.

OK, just wondering because the FontInfo instance that CommonFont uses to do the lookup ultimately gets passed in by the LM. Maybe in some cases it could be a more efficient use of resources to have CommonFont.getFontState() return only the compact triplet-objects in an array, which the LM can then later feed to the getFontInstance() method of the FontInfo it has access to.

If I remember correctly from studying the fonts package, getFontInstance() would also actually *load* the font if that is not yet done at the point the method is called. If a user were to specify 20 font-families, they would all be loaded even when only one of them is actually used. Can't remember exactly if Adrian's auto-detection patch changed this in any way.

> So, IMO the least impact change is to simply add Font[] getFontStates(...) to CommonFont. How it is implemented internally doesn't really worry me.

We can even make it a bit more flexible, I think. Make it FontTriplet[] as a return type, and add the getFontInstances(FontTriplet[],...) method to FontInfo. That way, the LM can ask all Font instances at once (if it wants/needs to), or just the first one to initialize from and create the others only if necessary.

IOW, the code in the LM could come to look like:

  //init()
  fontkeys = fobj.getCommonFont().getFontState(fobj.getFOEventHandler().getFontInfo());
  //instantiate the fonts, one ...
  initfont = fobj.getFOEventHandler().getFontInfo().getFontInstance(fontkeys[0], this);
  ...
  //other_method()
  //... by one
  fallbackfont = fobj.getFOEventHandler().getFontInfo().getFontInstance(fontkeys[x], this);
  //or if desired, create them all at once
  //preferably not used for large sets of unloaded fonts (?)
  allfonts = fobj.getFOEventHandler().getFontInfo().getFontInstances(fontkeys, this);

WDYT?

Thanks for the feedback.

Cheers

Andreas
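The idea of handing the LM lightweight keys and instantiating fonts only on demand might look roughly like the sketch below. It is an assumption-laden illustration, not the FOP API: `LazyFontSet` is a hypothetical name, the key type `K` stands in for a font triplet, and the `loader` function stands in for whatever actually creates (and loads) a font instance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical sketch: holds font keys in priority order and creates the
 * (potentially expensive) font instance for a key only on first access.
 */
final class LazyFontSet<K, F> {
    private final List<K> keys;
    private final Function<K, F> loader;
    private final Map<K, F> loaded = new HashMap<>();

    LazyFontSet(List<K> keys, Function<K, F> loader) {
        this.keys = new ArrayList<>(keys);
        this.loader = loader;
    }

    /** Number of keys in the priority list. */
    int size() {
        return keys.size();
    }

    /** Returns the font for the key at the given index, loading it at most once. */
    F get(int index) {
        return loaded.computeIfAbsent(keys.get(index), loader);
    }

    /** Number of fonts that have actually been instantiated so far. */
    int loadedCount() {
        return loaded.size();
    }
}
```

A caller that only ever needs the first family pays for one load, even if twenty families were specified; the others are created only when a fallback is actually requested.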
Re: font selection by character
On Thursday 26 July 2007 00:23, Andreas L Delmelle wrote:

> On Jul 25, 2007, at 02:25, Manuel Mall wrote:
>> On Wednesday 25 July 2007 05:17, Andreas L Delmelle wrote:
>>> Anyone have any objections, for instance, to change the default behaviour of FontInfo.fontLookup(String[], ...) to something like the below code? Should we add a getFontInstances() method too, or rather, since the below method is only used by CommonFont, maybe merge it into such a getFontInstances() method, that would return Font[]?
>>>
>>> To Manuel in particular: what would be most convenient from your point of view? Is it OK if each LM gets a set of Fonts (or mere triplet-keys), instead of only one, upon the call to getFontState()?
>>
>> LMs tend to call these methods only once typically in their initialize method. A quick search shows that getCommonFont() is called only 8 times in the LMs and in each case the call is the same:
>>
>>   font = fobj.getCommonFont().getFontState(fobj.getFOEventHandler().getFontInfo(), this);
>>
>> In summary LMs are not really interested in the CommonFont object as such. They only use it to get to the Font object (or now we want a list of Font objects) applicable to them. But the LMs would like a Font object not a Triplet.
>
> OK, just wondering because the FontInfo instance that CommonFont uses to do the lookup ultimately gets passed in by the LM. Maybe in some cases it could be a more efficient use of resources to have CommonFont.getFontState() return only the compact triplet-objects in an array, which the LM can then later feed to the getFontInstance() method of the FontInfo it has access to.
>
> If I remember correctly from studying the fonts package, getFontInstance() would also actually *load* the font if that is not yet done at the point the method is called. If a user were to specify 20 font-families, they would all be loaded even when only one of them is actually used. Can't remember exactly if Adrian's auto-detection patch changed this in any way.
>
>> So, IMO the least impact change is to simply add Font[] getFontStates(...) to CommonFont. How it is implemented internally doesn't really worry me.
>
> We can even make it a bit more flexible, I think. Make it FontTriplet[] as a return type, and add the getFontInstances(FontTriplet[],...) method to FontInfo. That way, the LM can ask all Font instances at once (if it wants/needs to), or just the first one to initialize from and create the others only if necessary.
>
> IOW, the code in the LM could come to look like:
>
>   //init()
>   fontkeys = fobj.getCommonFont().getFontState(fobj.getFOEventHandler().getFontInfo());
>   //instantiate the fonts, one ...
>   initfont = fobj.getFOEventHandler().getFontInfo().getFontInstance(fontkeys[0], this);
>   ...
>   //other_method()
>   //... by one
>   fallbackfont = fobj.getFOEventHandler().getFontInfo().getFontInstance(fontkeys[x], this);
>   //or if desired, create them all at once
>   //preferably not used for large sets of unloaded fonts (?)
>   allfonts = fobj.getFOEventHandler().getFontInfo().getFontInstances(fontkeys, this);
>
> WDYT?

Yes, that will work I think.

Thanks

Manuel

> Thanks for the feedback.
>
> Cheers
>
> Andreas
Re: font selection by character
On Jul 21, 2007, at 14:07, Andreas L Delmelle wrote:

> <snip />
> As to then further addressing the font-family fallback/font-selection issue, ...
> <snip />
> Change the getFontState() signature:
> - either make it public Font[] getFontStates()
> - or add an extra char parameter to getFontState(), that would allow CommonFont to seek a Font in its list that contains a mapping for the given char
>
> Which strategy is preferred? I'm even thinking that maybe it would even be better design not to store/create Font instances in CommonFont at all, and instead of doing so, *always* forward the font-lookup to FontInfo (not only the first time getFontState() is called, as is the case now).

Anyone have any objections, for instance, to change the default behaviour of FontInfo.fontLookup(String[], ...) to something like the below code? Should we add a getFontInstances() method too, or rather, since the below method is only used by CommonFont, maybe merge it into such a getFontInstances() method, that would return Font[]?

To Manuel in particular: what would be most convenient from your point of view? Is it OK if each LM gets a set of Fonts (or mere triplet-keys), instead of only one, upon the call to getFontState()?

Cheers

Andreas

-- sample FontInfo.fontLookup() --

    /**
     * Looks up (a set of) font(s).
     * <br>
     * Locate the font name(s) for the given families, style and weight.
     * The font name(s) can then be used as a key as they are unique for
     * the associated document.
     * This also adds the font(s) to the list of used fonts.
     * @param families font families (priority list)
     * @param style font style
     * @param weight font weight
     * @return the set of font triplets of the supported font-families
     *         in the specified style and weight.
     */
    public FontTriplet[] fontLookup(String[] families, String style, int weight) {
        FontTriplet triplet;
        List tmpTriplets = new ArrayList(families.length);
        for (int i = 0; i < families.length; i++) {
            // only the last family in the list is asked to substitute
            triplet = fontLookup(families[i], style, weight,
                    (i >= families.length - 1));
            if (triplet != null) {
                tmpTriplets.add(triplet);
            }
        }
        if (tmpTriplets.size() != 0) {
            // pass a typed array: the no-arg toArray() returns Object[]
            FontTriplet[] triplets = (FontTriplet[])
                    tmpTriplets.toArray(new FontTriplet[tmpTriplets.size()]);
            return triplets;
        }
        throw new IllegalStateException(
                "fontLookup must return an array with at least one "
                + "FontTriplet on the last call.");
    }

--
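One Java detail matters for any method that collects results in a `List` and returns a typed array, as a lookup of this shape does: the no-argument `List.toArray()` returns an `Object[]`, and casting that to a more specific array type compiles but fails at run time. The sketch below demonstrates this with `String` standing in for `FontTriplet`; the class name `ToArrayDemo` is purely illustrative.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: String stands in for FontTriplet. */
final class ToArrayDemo {

    /** Returns true if casting the result of the no-arg toArray() fails. */
    static boolean castOfPlainToArrayFails() {
        List<String> names = new ArrayList<>();
        names.add("Helvetica");
        try {
            Object[] raw = names.toArray();
            // Compiles, but the runtime type of raw is Object[], not String[].
            String[] typed = (String[]) raw;
            return typed.length < 0; // never reached for an ArrayList
        } catch (ClassCastException e) {
            return true;
        }
    }

    /** The typed overload returns an array of the requested component type. */
    static String[] typedToArray(List<String> names) {
        return names.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(castOfPlainToArrayFails());
        List<String> names = new ArrayList<>();
        names.add("Helvetica");
        names.add("Symbol");
        System.out.println(typedToArray(names).length);
    }
}
```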
Re: font selection by character
On Wednesday 25 July 2007 05:17, Andreas L Delmelle wrote:

> On Jul 21, 2007, at 14:07, Andreas L Delmelle wrote:
>> <snip />
>> As to then further addressing the font-family fallback/font-selection issue, ...
>> <snip />
>> Change the getFontState() signature:
>> - either make it public Font[] getFontStates()
>> - or add an extra char parameter to getFontState(), that would allow CommonFont to seek a Font in its list that contains a mapping for the given char
>>
>> Which strategy is preferred? I'm even thinking that maybe it would even be better design not to store/create Font instances in CommonFont at all, and instead of doing so, *always* forward the font-lookup to FontInfo (not only the first time getFontState() is called, as is the case now).
>
> Anyone have any objections, for instance, to change the default behaviour of FontInfo.fontLookup(String[], ...) to something like the below code? Should we add a getFontInstances() method too, or rather, since the below method is only used by CommonFont, maybe merge it into such a getFontInstances() method, that would return Font[]?
>
> To Manuel in particular: what would be most convenient from your point of view? Is it OK if each LM gets a set of Fonts (or mere triplet-keys), instead of only one, upon the call to getFontState()?

LMs tend to call these methods only once typically in their initialize method. A quick search shows that getCommonFont() is called only 8 times in the LMs and in each case the call is the same:

  font = fobj.getCommonFont().getFontState(fobj.getFOEventHandler().getFontInfo(), this);

In summary LMs are not really interested in the CommonFont object as such. They only use it to get to the Font object (or now we want a list of Font objects) applicable to them. But the LMs would like a Font object not a Triplet.

So, IMO the least impact change is to simply add Font[] getFontStates(...) to CommonFont. How it is implemented internally doesn't really worry me.

Manuel

> Cheers
>
> Andreas
>
> -- sample FontInfo.fontLookup() --
>
>     /**
>      * Looks up (a set of) font(s).
>      * <br>
>      * Locate the font name(s) for the given families, style and weight.
>      * The font name(s) can then be used as a key as they are unique for
>      * the associated document.
>      * This also adds the font(s) to the list of used fonts.
>      * @param families font families (priority list)
>      * @param style font style
>      * @param weight font weight
>      * @return the set of font triplets of the supported font-families
>      *         in the specified style and weight.
>      */
>     public FontTriplet[] fontLookup(String[] families, String style, int weight) {
>         FontTriplet triplet;
>         List tmpTriplets = new ArrayList(families.length);
>         for (int i = 0; i < families.length; i++) {
>             // only the last family in the list is asked to substitute
>             triplet = fontLookup(families[i], style, weight,
>                     (i >= families.length - 1));
>             if (triplet != null) {
>                 tmpTriplets.add(triplet);
>             }
>         }
>         if (tmpTriplets.size() != 0) {
>             // pass a typed array: the no-arg toArray() returns Object[]
>             FontTriplet[] triplets = (FontTriplet[])
>                     tmpTriplets.toArray(new FontTriplet[tmpTriplets.size()]);
>             return triplets;
>         }
>         throw new IllegalStateException(
>                 "fontLookup must return an array with at least one "
>                 + "FontTriplet on the last call.");
>     }
>
> --
Re: font selection by character
On Jul 20, 2007, at 11:56, Andreas L Delmelle wrote:

> On Jul 20, 2007, at 08:02, Manuel Mall wrote:
>> On Jul 20, 2007, at 05:47, Manuel Mall wrote:
>> <snip/>
>>> As to how the TextLM should then further handle it, I hadn't really looked deeper into so far, but it seems like you have... 8-)
>>
>> Andreas, how about this as a way forward: You implement the font lists in the Property system / FO Tree and I deal with the implementation of the font selection based on char, Knuth box and area creation logic in TextLM using the font data structures you come up with?
>
> Good idea. It should not take me too much time to finish that. Something for the weekend, maybe.

I've looked a bit closer, and bumped into something I would like to address as well concerning fonts, so thought I'd check here to see if anyone has had similar thoughts before.

org.apache.fop.fonts.Font contains a fontSize member. I would like to see this separated from the Font instance somehow. Instead of fetching a Font corresponding to given triplet and font-size, we would get one corresponding to the triplet. In the Font-methods that use the font-size, I would then add an int parameter, so the font-size can be passed in by the caller.

No idea if this makes sense, or what the initial motivation was to embed the font-size in the Font instance. If there's a good reason, please enlighten me...

Once this is done, the basic Font instances corresponding to the triplets as specified on the FO, could be fetched/checked very soon in the process, since the dependency on the only layout-dependent component is removed. Much sooner than is the case now: ultimately the instances are created following the call to CommonFont.getFontState() in TextLM.initialize(). (To be more precise, currently the instance (singular) corresponding to the first specified font-family is created ...)

As to then further addressing the font-family fallback/font-selection issue, as indicated, I would replace in CommonFont:

  private Font fontState;

by

  private Font[] fontStates;

Change the getFontState() signature:
- either make it public Font[] getFontStates()
- or add an extra char parameter to getFontState(), that would allow CommonFont to seek a Font in its list that contains a mapping for the given char

Which strategy is preferred? I'm even thinking that maybe it would even be better design not to store/create Font instances in CommonFont at all, and instead of doing so, *always* forward the font-lookup to FontInfo (not only the first time getFontState() is called, as is the case now).

WDYT?

Cheers

Andreas
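The proposal to take the font size out of the Font instance and pass it to the metric methods could be sketched as below. Everything here is an assumption made for illustration: `SizelessFont` is a hypothetical class, widths are stored in 1/1000 em, and the sketch presumes metrics that scale linearly with size, which need not hold for every kind of font.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical size-independent font: widths are stored once, in 1/1000 em. */
final class SizelessFont {
    private final Map<Character, Integer> widths = new HashMap<>();

    void setWidth(char c, int widthPerThousand) {
        widths.put(c, widthPerThousand);
    }

    boolean hasChar(char c) {
        return widths.containsKey(c);
    }

    /**
     * Width of c at the given font size. The size is supplied by the caller
     * instead of being stored in the font; the result is in the same unit
     * as fontSize.
     */
    int getWidth(char c, int fontSize) {
        Integer w = widths.get(c);
        if (w == null) {
            throw new IllegalArgumentException("no glyph for character " + c);
        }
        return (int) ((long) w * fontSize / 1000);
    }
}
```

One instance can then serve every size used in a document, so it could be looked up from the triplet alone, before layout knows the resolved font-size.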
Re: font selection by character
Hi Andreas,

My 2 cents on this, as I have a rather limited understanding of this area.

Andreas L Delmelle wrote:
> <snip/>
> org.apache.fop.fonts.Font contains a fontSize member. I would like to see this separated from the Font instance somehow. Instead of fetching a Font corresponding to given triplet and font-size, we would get one corresponding to the triplet. In the Font-methods that use the font-size, I would then add an int parameter, so the font-size can be passed in by the caller.
>
> No idea if this makes sense, or what the initial motivation was to embed the font-size in the Font instance. If there's a good reason, please enlighten me...

The font may depend on the desired size. For example, bitmap fonts have different instances for each size. IIC there's something like that in AFP. Also, some high-quality font families have several fonts for different sizes: one font for headers, with narrower glyphs, one regular font for the body, one for footnotes with wider glyphs, etc.

I don't know if that has some connection to your problem, but maybe it's worth keeping that in mind.

<snip/>

HTH,
Vincent
Re: font selection by character
On Jul 21, 2007, at 19:25, Vincent Hennebert wrote:

> Andreas L Delmelle wrote:
>> <snip/>
>> org.apache.fop.fonts.Font contains a fontSize member. I would like to see this separated from the Font instance somehow. Instead of fetching a Font corresponding to given triplet and font-size, we would get one corresponding to the triplet. In the Font-methods that use the font-size, I would then add an int parameter, so the font-size can be passed in by the caller.
>>
>> No idea if this makes sense, or what the initial motivation was to embed the font-size in the Font instance. If there's a good reason, please enlighten me...
>
> The font may depend on the desired size. For example, bitmap fonts have different instances for each size. IIC there's something like that in AFP.

Good thing I decided to ask first before proceeding and bumping into this myself... :-)

Thanks, Vincent.

Cheers

Andreas
Re: font selection by character
On Jul 20, 2007, at 08:02, Manuel Mall wrote:

> On Jul 20, 2007, at 05:47, Manuel Mall wrote:
> <snip/>
>> As to how the TextLM should then further handle it, I hadn't really looked deeper into so far, but it seems like you have... 8-)
>
> Andreas, how about this as a way forward: You implement the font lists in the Property system / FO Tree and I deal with the implementation of the font selection based on char, Knuth box and area creation logic in TextLM using the font data structures you come up with?

Good idea. It should not take me too much time to finish that. Something for the weekend, maybe.

Cheers

Andreas