[jira] [Resolved] (FOP-3139) Add support for font-selection-strategy=character-by-character

2023-07-20 Thread Simon Steiner (Jira)


 [ 
https://issues.apache.org/jira/browse/FOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Steiner resolved FOP-3139.

Fix Version/s: main
   Resolution: Fixed

https://github.com/apache/xmlgraphics-fop/commit/b16022ece329197f72f47943085d45b56e26806e

> Add support for font-selection-strategy=character-by-character
> --
>
> Key: FOP-3139
> URL: https://issues.apache.org/jira/browse/FOP-3139
> Project: FOP
>  Issue Type: Bug
>Reporter: Simon Steiner
>Assignee: Simon Steiner
>Priority: Major
> Fix For: main
>
> Attachments: test2.fo
>
>
> fop test.fo out.pdf
> All glyphs should map to a font



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FOP-3139) Add support for font-selection-strategy=character-by-character

2023-07-20 Thread Simon Steiner (Jira)


 [ 
https://issues.apache.org/jira/browse/FOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Steiner updated FOP-3139:
---
Description: 
fop test.fo out.pdf

All glyphs should map to a font

> Add support for font-selection-strategy=character-by-character
> --
>
> Key: FOP-3139
> URL: https://issues.apache.org/jira/browse/FOP-3139
> Project: FOP
>  Issue Type: Bug
>Reporter: Simon Steiner
>Assignee: Simon Steiner
>Priority: Major
> Attachments: test2.fo
>
>
> fop test.fo out.pdf
> All glyphs should map to a font





[jira] [Updated] (FOP-3139) Add support for font-selection-strategy=character-by-character

2023-07-20 Thread Simon Steiner (Jira)


 [ 
https://issues.apache.org/jira/browse/FOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Steiner updated FOP-3139:
---
Attachment: test2.fo

> Add support for font-selection-strategy=character-by-character
> --
>
> Key: FOP-3139
> URL: https://issues.apache.org/jira/browse/FOP-3139
> Project: FOP
>  Issue Type: Bug
>Reporter: Simon Steiner
>Assignee: Simon Steiner
>Priority: Major
> Attachments: test2.fo
>
>






[jira] [Created] (FOP-3139) Add support for font-selection-strategy=character-by-character

2023-07-20 Thread Simon Steiner (Jira)
Simon Steiner created FOP-3139:
--

 Summary: Add support for font-selection-strategy=character-by-character
 Key: FOP-3139
 URL: https://issues.apache.org/jira/browse/FOP-3139
 Project: FOP
  Issue Type: Bug
Reporter: Simon Steiner
Assignee: Simon Steiner








Re: font selection by character

2007-10-19 Thread Jeremias Maerki
Great, thanks for looking into it! AFAIU, your concept is right. The
spec refers to CSS on how to handle the font selection (see [1]).

But I also think that's the minimal algorithm. If you have two Symbol
characters and a space between, it's probably slightly better-looking to
use Symbol's space character, i.e. no font-family change for only
white-space. But I'd stick to what's easier to implement, first.
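
A minimal sketch of that refinement, in Java. Everything in it is an
assumption made for illustration: fonts are modelled as a priority-ordered
map from a font name to the characters it covers, and the class and method
names are hypothetical, not FOP's API. White-space stays in the font of the
preceding character when that font has a glyph for it; every other
character takes the first font in the list that covers it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not FOP code: char-by-char selection that keeps white-space in the previous font. */
class WhitespaceAwareSelection {

    /**
     * @param fonts font name -> covered characters, in priority order (assumed data model)
     * @param text  the text to assign fonts to
     * @return the font name chosen for each character of text
     */
    static List<String> assign(Map<String, String> fonts, String text) {
        List<String> result = new ArrayList<String>();
        String previous = null;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String chosen = null;
            if (Character.isWhitespace(c) && previous != null
                    && fonts.get(previous).indexOf(c) >= 0) {
                // no font change for white-space alone
                chosen = previous;
            } else {
                // plain rule: first font in the list with a matching glyph
                for (Map.Entry<String, String> e : fonts.entrySet()) {
                    if (e.getValue().indexOf(c) >= 0) {
                        chosen = e.getKey();
                        break;
                    }
                }
            }
            if (chosen == null) {
                // assumption: fall back to the first font when nothing matches
                chosen = fonts.keySet().iterator().next();
            }
            result.add(chosen);
            previous = chosen;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> fonts = new LinkedHashMap<String, String>();
        fonts.put("Latin", "abc -");
        fonts.put("Symbol", "\u03b1\u03b2 ");
        // two Symbol characters with a space between them
        System.out.println(assign(fonts, "\u03b1 \u03b2"));
    }
}
```

With the plain rule the space in the example would come from the first
font in the list; here it stays in the font of the preceding character.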

CSS also defines a font-matching concept which, AFAIU, kicks in when
mere font-selection doesn't help anymore. But IMO, that's another
building site which shouldn't be too difficult to add once we have the
basic infrastructure for char-by-char font selection.

[1] http://www.w3.org/TR/REC-CSS2/fonts.html#propdef-font-family
[2] http://www.w3.org/TR/REC-CSS2/fonts.html#algorithm

Jeremias Maerki



On 19.10.2007 13:50:55 Manuel Mall wrote:
> Because the issue of font selection by character was raised again on
> fop-user I have started to look at it as the appropriate font data
> structures are now in place.
>
> I am just looking for a confirmation here of the selection algorithm. My
> understanding is that for each character, independent of its context,
> the font list is searched for the first font with a matching glyph.
> This means in particular for characters like space and hyphen (as used
> for hyphenation), they will always be taken from the first font
> containing them even if their 'neighbouring' characters are from a
> different font?
>
> Manuel



font selection by character

2007-10-19 Thread Manuel Mall
Because the issue of font selection by character was raised again on 
fop-user I have started to look at it as the appropriate font data 
structures are now in place.

I am just looking for a confirmation here of the selection algorithm. My 
understanding is that for each character, independent of its context, 
the font list is searched for the first font with a matching glyph. 
This means in particular for characters like space and hyphen (as used 
for hyphenation), they will always be taken from the first font 
containing them even if their 'neighbouring' characters are from a 
different font?

Manuel
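
The rule described above can be sketched in Java as follows. This is only
an illustration under stated assumptions: GlyphFont, font() and
selectFont() are hypothetical names, not FOP's Font or FontInfo API, and
falling back to the first font when no font has the glyph is an assumption
too.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a font that can report glyph coverage; not FOP's Font class. */
interface GlyphFont {
    String name();
    boolean hasChar(char c);
}

class CharByCharSelection {

    /** Builds a toy font whose coverage is the set of characters in 'covered'. */
    static GlyphFont font(final String name, final String covered) {
        return new GlyphFont() {
            public String name() {
                return name;
            }
            public boolean hasChar(char c) {
                return covered.indexOf(c) >= 0;
            }
        };
    }

    /**
     * Returns the first font in priority order with a glyph for c,
     * independent of the neighbouring characters.
     */
    static GlyphFont selectFont(List<GlyphFont> fonts, char c) {
        for (GlyphFont f : fonts) {
            if (f.hasChar(c)) {
                return f;
            }
        }
        // assumption: use the first font when no font has the glyph
        return fonts.get(0);
    }

    public static void main(String[] args) {
        List<GlyphFont> fonts = Arrays.asList(
                font("Latin", "ab -"), font("Symbol", "\u03b1\u03b2 "));
        for (char c : "a \u03b1 \u03b2".toCharArray()) {
            System.out.println("'" + c + "' -> " + selectFont(fonts, c).name());
        }
    }
}
```

Note that the space and the hyphen always resolve to the first font that
contains them, whichever font their neighbours come from.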


Re: font selection by character

2007-07-29 Thread Andreas L Delmelle

On Jul 25, 2007, at 19:48, Manuel Mall wrote:

Manuel,

Just a little heads-up: I've already made the discussed change to  
CommonFont locally, but since I had already made a few other  
alterations to this class and this particular change is only minimal,  
they will be in the same commit (which should be in roughly 24 hours  
if no objections arise to my cache proposal).


Cheers

Andreas


On Thursday 26 July 2007 00:23, Andreas L Delmelle wrote:

<snip/>
IOW, the code in the LM could come to look like:

//init()
fontkeys = fobj.getCommonFont().getFontState(
        fobj.getFOEventHandler().getFontInfo());

//instantiate the fonts, one ...
initfont = fobj.getFOEventHandler().getFontInfo().getFontInstance(
        fontkeys[0], this);
...

//other_method()
//... by one
fallbackfont = fobj.getFOEventHandler().getFontInfo().getFontInstance(
        fontkeys[x], this);
//or if desired, create them all at once
//preferably not used for large sets of unloaded fonts (?)
allfonts = fobj.getFOEventHandler().getFontInfo().getFontInstances(
        fontkeys, this);



WDYT?


Yes, that will work I think.

Thanks

Manuel


Thanks for the feedback.

Cheers

Andreas






Re: font selection by character

2007-07-25 Thread Andreas L Delmelle

On Jul 25, 2007, at 02:25, Manuel Mall wrote:


On Wednesday 25 July 2007 05:17, Andreas L Delmelle wrote:


Anyone have any objections, for instance, to change the default
behaviour of FontInfo.fontLookup(String[], ...) to something like the
below code? Should we add a getFontInstances() method too, or rather,
since the below method is only used by CommonFont, maybe merge it
into such a getFontInstances() method, that would return Font[]?

To Manuel in particular: what would be most convenient from your
point of view? Is it OK if each LM gets a set of Fonts (or mere
triplet-keys), instead of only one, upon the call to getFontState()?



LMs tend to call these methods only once typically in their initialize
method. A quick search shows that getCommonFont() is called only 8
times in the LMs and in each case the call is the same:

font = fobj.getCommonFont().getFontState(
        fobj.getFOEventHandler().getFontInfo(), this);

In summary LMs are not really interested in the CommonFont object as
such. They only use it to get to the Font object (or now we want a list
of Font objects) applicable to them. But the LMs would like a Font
object not a Triplet.


OK, just wondering because the FontInfo instance that CommonFont uses  
to do the lookup ultimately gets passed in by the LM. Maybe in some  
cases it could be a more efficient use of resources to have  
CommonFont.getFontState() return only the compact triplet-objects in  
an array, which the LM can then later feed to the getFontInstance()  
method of the FontInfo it has access to. If I remember correctly from  
studying the fonts package, getFontInstance() would also actually  
*load* the font if that is not yet done at the point the method is  
called. If a user were to specify 20 font-families, they would all be  
loaded even when only one of them is actually used. Can't remember  
exactly if Adrian's auto-detection patch changed this in any way.
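
A minimal sketch of the idea in Java, under stated assumptions: LazyFonts
is a hypothetical helper, not part of FOP, and the loader function stands
in for whatever actually loads a font from its key. The holder keeps only
the compact keys and creates each instance the first time it is requested,
so unused families are never loaded.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical holder: keeps cheap keys and loads the expensive objects on first use. */
class LazyFonts<K, F> {
    private final List<K> keys;
    private final Function<K, F> loader;
    private final Map<K, F> loaded = new HashMap<K, F>();

    LazyFonts(List<K> keys, Function<K, F> loader) {
        this.keys = keys;
        this.loader = loader;
    }

    /** Returns the instance for the key at index, loading it only if needed. */
    F get(int index) {
        return loaded.computeIfAbsent(keys.get(index), loader);
    }

    /** Number of instances loaded so far. */
    int loadedCount() {
        return loaded.size();
    }

    public static void main(String[] args) {
        LazyFonts<String, String> fonts = new LazyFonts<String, String>(
                Arrays.asList("Helvetica", "Symbol", "ZapfDingbats"),
                key -> "Font(" + key + ")");
        fonts.get(0); // only the first family is loaded at this point
        System.out.println("loaded: " + fonts.loadedCount());
    }
}
```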



So, IMO the least impact change is to simply add Font[]
getFontStates(...) to CommonFont. How it is implemented internally
doesn't really worry me.


We can even make it a bit more flexible, I think.
Make it FontTriplet[] as a return type, and add the
getFontInstances(FontTriplet[],...) method to FontInfo.


That way, the LM can ask for all Font instances at once (if it
wants/needs to), or just the first one to initialize from and create the
others only if necessary.


IOW, the code in the LM could come to look like:

//init()
fontkeys = fobj.getCommonFont().getFontState(
        fobj.getFOEventHandler().getFontInfo());

//instantiate the fonts, one ...
initfont = fobj.getFOEventHandler().getFontInfo().getFontInstance(
        fontkeys[0], this);

...

//other_method()
//... by one
fallbackfont = fobj.getFOEventHandler().getFontInfo().getFontInstance(
        fontkeys[x], this);

//or if desired, create them all at once
//preferably not used for large sets of unloaded fonts (?)
allfonts = fobj.getFOEventHandler().getFontInfo().getFontInstances(
        fontkeys, this);




WDYT?

Thanks for the feedback.

Cheers

Andreas


Re: font selection by character

2007-07-25 Thread Manuel Mall
On Thursday 26 July 2007 00:23, Andreas L Delmelle wrote:
 On Jul 25, 2007, at 02:25, Manuel Mall wrote:
  On Wednesday 25 July 2007 05:17, Andreas L Delmelle wrote:
  Anyone have any objections, for instance, to change the default
  behaviour of FontInfo.fontLookup(String[], ...) to something like
  the below code? Should we add a getFontInstances() method too, or
  rather, since the below method is only used by CommonFont, maybe
  merge it into such a getFontInstances() method, that would return
  Font[]?
 
  To Manuel in particular: what would be most convenient from your
  point of view? Is it OK if each LM gets a set of Fonts (or mere
  triplet-keys), instead of only one, upon the call to
  getFontState()?
 
  LMs tend to call these methods only once typically in their
  initialize method. A quick search shows that getCommonFont() is
  called only 8 times in the LMs and in each case the call is the
  same:
 
  font = fobj.getCommonFont().getFontState(
          fobj.getFOEventHandler().getFontInfo(), this);
 
  In summary LMs are not really interested in the CommonFont object
  as such. They only use it to get to the Font object (or now we want
  a list of Font objects) applicable to them. But the LMs would like
  a Font object not a Triplet.

 OK, just wondering because the FontInfo instance that CommonFont uses
 to do the lookup ultimately gets passed in by the LM. Maybe in some
 cases it could be a more efficient use of resources to have
 CommonFont.getFontState() return only the compact triplet-objects in
 an array, which the LM can then later feed to the getFontInstance()
 method of the FontInfo it has access to. If I remember correctly from
 studying the fonts package, getFontInstance() would also actually
 *load* the font if that is not yet done at the point the method is
 called. If a user were to specify 20 font-families, they would all be
 loaded even when only one of them is actually used. Can't remember
 exactly if Adrian's auto-detection patch changed this in any way.

  So, IMO the least impact change is to simply add font[]
  getFontStates(...) to CommonFont. How it is implemented internally
  doesn't really worry me.

 We can even make it a bit more flexible, I think.
 Make it FontTriplet[] as a return type, and add the getFontInstances
 (FontTriplet[],...) method to FontInfo.

 That way, the LM can ask all Font instances at once (if it wants/
 needs to), or just the first one to initialize from and create the
 others only if necessary.

 IOW, the code in the LM could come to look like:

 //init()
 fontkeys = fobj.getCommonFont().getFontState(
         fobj.getFOEventHandler().getFontInfo());

 //instantiate the fonts, one ...
 initfont = fobj.getFOEventHandler().getFontInfo().getFontInstance(
         fontkeys[0], this);
 ...

 //other_method()
 //... by one
 fallbackfont = fobj.getFOEventHandler().getFontInfo().getFontInstance(
         fontkeys[x], this);
 //or if desired, create them all at once
 //preferably not used for large sets of unloaded fonts (?)
 allfonts = fobj.getFOEventHandler().getFontInfo().getFontInstances(
         fontkeys, this);



 WDYT?

Yes, that will work I think.

Thanks

Manuel

 Thanks for the feedback.

 Cheers

 Andreas


Re: font selection by character

2007-07-24 Thread Andreas L Delmelle

On Jul 21, 2007, at 14:07, Andreas L Delmelle wrote:


<snip/>
As to then further addressing the font-family fallback/font-selection
issue, ...



snip /

Change the getFontState() signature:

- either make it
public Font[] getFontStates()

- or add an extra char parameter to getFontState(), that would  
allow CommonFont to seek a Font in its list that contains a mapping  
for the given char



Which strategy is preferred? I'm even thinking that maybe it would  
even be better design not to store/create Font instances in  
CommonFont at all, and instead of doing so, *always* forward the  
font-lookup to FontInfo (not only the first time getFontState() is  
called, as is the case now).


Anyone have any objections, for instance, to change the default  
behaviour of FontInfo.fontLookup(String[], ...) to something like the  
below code? Should we add a getFontInstances() method too, or rather,  
since the below method is only used by CommonFont, maybe merge it  
into such a getFontInstances() method, that would return Font[]?


To Manuel in particular: what would be most convenient from your  
point of view? Is it OK if each LM gets a set of Fonts (or mere  
triplet-keys), instead of only one, upon the call to getFontState()?



Cheers

Andreas

-- sample FontInfo.fontLookup() --
/**
 * Looks up (a set of) font(s).
 * <br>
 * Locate the font name(s) for the given families, style and weight.
 * The font name(s) can then be used as a key as they are unique for
 * the associated document.
 * This also adds the font(s) to the list of used fonts.
 * @param families  font families (priority list)
 * @param style     font style
 * @param weight    font weight
 * @return the set of font triplets of the supported font-families
 *         in the specified style and weight.
 */
public FontTriplet[] fontLookup(String[] families, String style,
                                int weight) {
    FontTriplet triplet;
    List tmpTriplets = new ArrayList(families.length);
    for (int i = 0; i < families.length; i++) {
        // the flag is true only for the last family in the list
        triplet = fontLookup(families[i], style, weight,
                             (i >= families.length - 1));
        if (triplet != null) {
            tmpTriplets.add(triplet);
        }
    }
    if (tmpTriplets.size() != 0) {
        // toArray() without an argument returns Object[], so casting
        // it to FontTriplet[] fails at runtime; pass a typed array
        return (FontTriplet[]) tmpTriplets.toArray(
                new FontTriplet[tmpTriplets.size()]);
    }
    throw new IllegalStateException(
            "fontLookup must return an array with at least one "
            + "FontTriplet on the last call.");
}
--




Re: font selection by character

2007-07-24 Thread Manuel Mall
On Wednesday 25 July 2007 05:17, Andreas L Delmelle wrote:
 On Jul 21, 2007, at 14:07, Andreas L Delmelle wrote:
  snip /
  As to then further addressing the font-family fallback/font-
  selection issue, ...

 snip /

  Change the getFontState() signature:
 
  - either make it
  public Font[] getFontStates()
 
  - or add an extra char parameter to getFontState(), that would
  allow CommonFont to seek a Font in its list that contains a mapping
  for the given char
 
 
  Which strategy is preferred? I'm even thinking that maybe it would
  even be better design not to store/create Font instances in
  CommonFont at all, and instead of doing so, *always* forward the
  font-lookup to FontInfo (not only the first time getFontState() is
  called, as is the case now).

 Anyone have any objections, for instance, to change the default
 behaviour of FontInfo.fontLookup(String[], ...) to something like the
 below code? Should we add a getFontInstances() method too, or rather,
 since the below method is only used by CommonFont, maybe merge it
 into such a getFontInstances() method, that would return Font[]?

 To Manuel in particular: what would be most convenient from your
 point of view? Is it OK if each LM gets a set of Fonts (or mere
 triplet-keys), instead of only one, upon the call to getFontState()?


LMs tend to call these methods only once typically in their initialize 
method. A quick search shows that getCommonFont() is called only 8 
times in the LMs and in each case the call is the same:

font = 
fobj.getCommonFont().getFontState(fobj.getFOEventHandler().getFontInfo(), 
this);

In summary LMs are not really interested in the CommonFont object as 
such. They only use it to get to the Font object (or now we want a list 
of Font objects) applicable to them. But the LMs would like a Font 
object not a Triplet.

So, IMO the least impact change is to simply add Font[] 
getFontStates(...) to CommonFont. How it is implemented internally 
doesn't really worry me.

Manuel


 Cheers

 Andreas

 -- sample FontInfo.fontLookup() --
  /**
   * Looks up (a set of) font(s).
   * <br>
   * Locate the font name(s) for the given families, style and weight.
   * The font name(s) can then be used as a key as they are unique for
   * the associated document.
   * This also adds the font(s) to the list of used fonts.
   * @param families  font families (priority list)
   * @param style     font style
   * @param weight    font weight
   * @return the set of font triplets of the supported font-families
   *         in the specified style and weight.
   */
  public FontTriplet[] fontLookup(String[] families, String style,
                                  int weight) {
      FontTriplet triplet;
      List tmpTriplets = new ArrayList(families.length);
      for (int i = 0; i < families.length; i++) {
          // the flag is true only for the last family in the list
          triplet = fontLookup(families[i], style, weight,
                               (i >= families.length - 1));
          if (triplet != null) {
              tmpTriplets.add(triplet);
          }
      }
      if (tmpTriplets.size() != 0) {
          // toArray() without an argument returns Object[], so casting
          // it to FontTriplet[] fails at runtime; pass a typed array
          return (FontTriplet[]) tmpTriplets.toArray(
                  new FontTriplet[tmpTriplets.size()]);
      }
      throw new IllegalStateException(
              "fontLookup must return an array with at least one "
              + "FontTriplet on the last call.");
  }
 --


Re: font selection by character

2007-07-21 Thread Andreas L Delmelle

On Jul 20, 2007, at 11:56, Andreas L Delmelle wrote:


On Jul 20, 2007, at 08:02, Manuel Mall wrote:


On Jul 20, 2007, at 05:47, Manuel Mall wrote:


snip/

As to how the TextLM should then further handle it, I hadn't really
looked deeper into so far, but it seems like you have... 8-)


Andreas, how about this as a way forward: You implement the font lists in
the Property system / FO Tree and I deal with the implementation of the
font selection based on char, Knuth box and area creation logic in TextLM
using the font data structures you come up with?


Good idea. It should not take me too much time to finish that.  
Something for the weekend, maybe.


I've looked a bit closer, and bumped into something I would like to  
address as well concerning fonts, so thought I'd check here to see if  
anyone has had similar thoughts before.


org.apache.fop.fonts.Font contains a fontSize member. I would like to
see this separated from the Font instance somehow. Instead of
fetching a Font corresponding to given triplet and font-size, we
would get one corresponding to the triplet. In the Font-methods that
use the font-size, I would then add an int parameter, so the font-size
can be passed in by the caller.
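
A small sketch of what such a signature could look like, in Java. It is
not FOP's Font class: SizeIndependentMetrics and its methods are
hypothetical, and the units (advance widths in 1/1000 of the font size,
font size as a plain int) are assumptions made for the example.

```java
/** Hypothetical size-independent metrics object; the size is supplied by the caller. */
class SizeIndependentMetrics {
    // advance widths in 1/1000 of the font size, ASCII only for this sketch
    private final int[] widths1000 = new int[128];

    void setWidth(char c, int width1000) {
        widths1000[c] = width1000;
    }

    /** Width of c at the given font size, in the same unit as fontSize. */
    int getCharWidth(char c, int fontSize) {
        return widths1000[c] * fontSize / 1000;
    }

    public static void main(String[] args) {
        SizeIndependentMetrics metrics = new SizeIndependentMetrics();
        metrics.setWidth('a', 500);
        // one instance serves every size
        System.out.println(metrics.getCharWidth('a', 12000));
        System.out.println(metrics.getCharWidth('a', 24000));
    }
}
```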


No idea if this makes sense, or what the initial motivation was to  
embed the font-size in the Font instance. If there's a good reason,  
please enlighten me...



Once this is done, the basic Font instances corresponding to the  
triplets as specified on the FO, could be fetched/checked very soon  
in the process, since the dependency on the only layout-dependent  
component is removed. Much sooner than is the case now: ultimately  
the instances are created following the call to  
CommonFont.getFontState() in TextLM.initialize().
(To be more precise, currently the instance (singular) corresponding  
to the first specified font-family is created ...)


As to then further addressing the font-family fallback/font-selection  
issue, as indicated, I would replace in CommonFont:


private Font fontState;

by

private Font[] fontStates;

Change the getFontState() signature:

- either make it
public Font[] getFontStates()

- or add an extra char parameter to getFontState(), that would allow  
CommonFont to seek a Font in its list that contains a mapping for the  
given char



Which strategy is preferred? I'm even thinking that it might be better
design not to store/create Font instances in CommonFont at all, and
instead *always* forward the font-lookup to FontInfo (not only the
first time getFontState() is called, as is the case now).



WDYT?


Cheers

Andreas



Re: font selection by character

2007-07-21 Thread Vincent Hennebert
Hi Andreas,

My 2 cents on this, as I have a rather limited understanding of this area.

Andreas L Delmelle a écrit :
snip/
 org.apache.fop.fonts.Font contains a fontSize member. I would like to
 see this separated from the Font instance somehow. Instead of fetching a
 Font corresponding to given triplet and font-size, we would get one
 corresponding to the triplet. In the Font-methods that use the
 font-size, I would then add an int parameter, so the font-size can be
 passed in by the caller.
 
 No idea if this makes sense, or what the initial motivation was to embed
 the font-size in the Font instance. If there's a good reason, please
 enlighten me...

The font may depend on the desired size. For example, bitmap fonts have
different instances for each size. IIC there's something like that in AFP.

Also, some high-quality font families have several fonts for different
sizes: one font for headers, with narrower glyphs, one regular font for
the body, one for footnotes with wider glyphs, etc.
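
To illustrate the point, a sketch in Java of a family whose physical font
depends on the requested size. The OpticalSizes class, the design names
and the size thresholds are all invented for the example; nothing here
reflects how FOP or AFP actually resolves fonts.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical family with several designs, each used from a minimum size upwards. */
class OpticalSizes {
    private final TreeMap<Integer, String> designs = new TreeMap<Integer, String>();

    void addDesign(int minSize, String fontName) {
        designs.put(minSize, fontName);
    }

    /** Picks the design with the largest minimum size not above fontSize. */
    String forSize(int fontSize) {
        Map.Entry<Integer, String> e = designs.floorEntry(fontSize);
        // below the smallest threshold: use the smallest design
        // (assumes at least one design has been registered)
        return e != null ? e.getValue() : designs.firstEntry().getValue();
    }

    public static void main(String[] args) {
        OpticalSizes family = new OpticalSizes();
        family.addDesign(0, "Footnote");
        family.addDesign(9000, "Regular");
        family.addDesign(14000, "Header");
        System.out.println(family.forSize(8000));
        System.out.println(family.forSize(10000));
        System.out.println(family.forSize(20000));
    }
}
```

In such a setup the triplet alone does not identify the font, which is
why the size cannot simply be dropped from the lookup.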

I don't know if that has some connection to your problem, but maybe it's
worth keeping that in mind.

snip/

HTH,
Vincent


Re: font selection by character

2007-07-21 Thread Andreas L Delmelle

On Jul 21, 2007, at 19:25, Vincent Hennebert wrote:


Andreas L Delmelle a écrit :
snip/

org.apache.fop.fonts.Font contains a fontSize member. I would like to
see this separated from the Font instance somehow. Instead of fetching a
Font corresponding to given triplet and font-size, we would get one
corresponding to the triplet. In the Font-methods that use the
font-size, I would then add an int parameter, so the font-size can be
passed in by the caller.

No idea if this makes sense, or what the initial motivation was to embed
the font-size in the Font instance. If there's a good reason, please
enlighten me...


The font may depend on the desired size. For example, bitmap fonts have
different instances for each size. IIC there's something like that in AFP.


Good thing I decided to ask first before proceeding and bumping into  
this myself... :-)


Thanks, Vincent.

Cheers

Andreas



Re: font selection by character

2007-07-20 Thread Andreas L Delmelle

On Jul 20, 2007, at 08:02, Manuel Mall wrote:


On Jul 20, 2007, at 05:47, Manuel Mall wrote:


snip/

As to how the TextLM should then further handle it, I hadn't really
looked deeper into so far, but it seems like you have... 8-)


Andreas, how about this as a way forward: You implement the font lists in
the Property system / FO Tree and I deal with the implementation of the
font selection based on char, Knuth box and area creation logic in TextLM
using the font data structures you come up with?


Good idea. It should not take me too much time to finish that.  
Something for the weekend, maybe.



Cheers

Andreas