Foray's font subsystem for Fop

2005-06-12 Thread Vincent Hennebert

Hi Fop Team and Victor,

I'm considering adapting Foray's font subsystem to Fop. I have already 
experimented a bit and it seems quite feasible. So far I have 
encountered two problems:


- logging mechanism: Foray uses the avalon framework while Fop uses commons 
logging. The 2 APIs are similar but I suppose I'll have to convert the avalon 
stuff into commons. Or are there any plans to change the logging mechanism (I'm 
thinking about the FOPAvalonization Wiki page)? Another minor problem will be to 
plug the right logger to the font subsystem. I guess only one logger is created 
and passed through all classes?


- the font subsystem is based on a client/server architecture; the question is: 
which Fop class should be made a FontConsumer? And where should the FontServer 
be created and held? So far I've used FOEventHandler as a FontConsumer and a 
holder of a FontServer. It's quite convenient but I'm not sure at all that it is 
good design; I'm not yet used to Fop's overall architecture.


I welcome any additional thoughts/comments. Now starting to work...

Regards,
Vincent


Re: AW: MathML and barcode support for FOP

2005-07-27 Thread Vincent Hennebert
While we are speaking of that, if I may give my opinion: I agree with Norman 
that using images to render maths isn't a good solution in the long-term. The 
fact that it is SVG improves the situation a bit because fonts will be rendered 
fine, but there are other problems to address: for example it is difficult to 
align the baseline of an inline-rendered equation with the text's baseline. It 
also is not possible to break an equation into multiple lines.


A native MathML renderer will be necessary to beat TeX in this area. I was 
thinking of writing one for Fop but I lack the time, and for now I'm busy with 
the font subsystem.


The work referred to by Siarhei Baidun may definitely be interesting.

Vincent

Jeremias Maerki wrote:

The MathML extension uses JEuclid to convert the MathML to SVG
internally so we get quite good quality. I don't think it is possible to
create XSL-FO code from MathML because you can't properly place all the
elements. Doing that with SVG is a lot better.

On 27.07.2005 10:54:45 Norman Markgraf wrote:


Sorry to interrupt you all. But I have some concerns about using JEuclid for
MathML. I'm not sure if I have permission to post here, but maybe you will
excuse my post if not.

I am not sure if using JEuclid is the right way to deal with MathML. As far
as I understand, JEuclid transforms a MathML expression into an image. If
this is correct, then I would find this the wrong way in principle.
Wouldn't it be nicer if the MathML expression were converted into XSL-FO
itself? I am not very experienced in this field, but as far as I understand
MathML (pm) this should be the way to go. Or do I completely misinterpret
something?




Jeremias Maerki





offline

2005-07-30 Thread Vincent Hennebert

Hi all,

I'll be offline for 3 weeks: summer holidays, far from computers ;-)

My work on the font subsystem is coming along, a bit slowly these last few days, 
but I hope to have more time after the holidays. Currently I'm on the pdf part: 
it's a bit difficult because font and pdf code are very intermixed, and are 
handled slightly differently in each of Fop 0.20.5, Fop trunk and FOray.


I don't know if the font subsystem will be ready for the next release, but I 
think that after the pdf part is done the rest should be easier to adapt. I'll 
do my best to make it ready.


Just in case, I've put a patch on Bugzilla for those who might want to see my 
changes during these next 3 weeks. There are conflict issues, problems with the 
new layoutmgr.inline subpackage and the code won't compile. So this is really 
for curious people who have time to lose ;-)


See you in 3 weeks.
Cheers,
Vincent


Re: Relative font weights and font selection

2005-08-12 Thread Vincent Hennebert

Victor Mote wrote:

Manuel Mall wrote:


Regarding the bolder, lighter issue and the general font 
selection I looked at the pre-patch for FOrayFont adaptation 
to Fop 
(http://issues.apache.org/bugzilla/show_bug.cgi?id=35948) and 
concluded that meddling with the font selection system will 
interfere with the FOray font integration and that the FOray 
font system has addressed most of the font selection issues 
any way (not sure about the bolder, lighter bits though). 
I will therefore back-off from that line of work and wait for 
the FOray font integration to complete, assuming that it is 
still going ahead.



Sorry to be so slow responding. I think Vincent is taking August off, but is
still working on the font integration work.


I confirm. I have one more week offline (I'm connected only tonight) and then 
I'll get back to my work on font integration.




Manuel and I have had an off-line conversation about the bolder/lighter
issue, and I think we will need to improve both the interface and the
implementation to handle this and the similar issues for font-stretch. I'll
work on that in the next week or two.


There was a TODO in the code where bolder and lighter should be handled. 
I've left it as is for now as it is not very important yet. I had the feeling 
that the new font mechanism would ease things but as you say there seems to be 
some work to do. We will have to discuss that one day...


Cheers,
Vincent


Re: svn commit: r240012 - in /xmlgraphics/fop/trunk/src/java/org/apache/fop/render: pdf/PDFRenderer.java ps/PSRenderer.java

2005-08-25 Thread Vincent Hennebert

Jeremias,

Just in case you intended to do any improvement there: the FOrayFont integration 
may bring some facilities in this area. At least the handling will be different, 
so I don't think it's worth working on this before the integration is done. So 
please leave it as is for now. Thanks!
I've finished reading the huge amount of mails that have been written to this 
list during August, getting back to work now.


Regards,
Vincent

[EMAIL PROTECTED] wrote:

Author: jeremias
Date: Thu Aug 25 00:28:27 2005
New Revision: 240012

URL: http://svn.apache.org/viewcvs?rev=240012&view=rev
Log:
Kerning is currently not supported by the layout engine, so disable it for PDF 
and add a TODO item for PS.

Modified:
xmlgraphics/fop/trunk/src/java/org/apache/fop/render/pdf/PDFRenderer.java
xmlgraphics/fop/trunk/src/java/org/apache/fop/render/ps/PSRenderer.java

Modified: 
xmlgraphics/fop/trunk/src/java/org/apache/fop/render/pdf/PDFRenderer.java
URL: 
http://svn.apache.org/viewcvs/xmlgraphics/fop/trunk/src/java/org/apache/fop/render/pdf/PDFRenderer.java?rev=240012&r1=240011&r2=240012&view=diff
==
--- xmlgraphics/fop/trunk/src/java/org/apache/fop/render/pdf/PDFRenderer.java 
(original)
+++ xmlgraphics/fop/trunk/src/java/org/apache/fop/render/pdf/PDFRenderer.java 
Thu Aug 25 00:28:27 2005
@@ -1187,7 +1187,9 @@
 boolean kerningAvailable = false;
 Map kerning = fs.getKerning();
 if (kerning != null && !kerning.isEmpty()) {
-kerningAvailable = true;
+//kerningAvailable = true;
+//TODO Reenable me when the layout engine supports kerning, too
+log.warn("Kerning support is disabled until it is supported by the layout engine!");
 }
 
 int l = s.length();


Modified: 
xmlgraphics/fop/trunk/src/java/org/apache/fop/render/ps/PSRenderer.java
URL: 
http://svn.apache.org/viewcvs/xmlgraphics/fop/trunk/src/java/org/apache/fop/render/ps/PSRenderer.java?rev=240012&r1=240011&r2=240012&view=diff
==
--- xmlgraphics/fop/trunk/src/java/org/apache/fop/render/ps/PSRenderer.java 
(original)
+++ xmlgraphics/fop/trunk/src/java/org/apache/fop/render/ps/PSRenderer.java Thu 
Aug 25 00:28:27 2005
@@ -25,6 +25,7 @@
 import java.io.OutputStream;
 import java.util.Iterator;
 import java.util.List;
+import java.util.Map;
 
 // FOP

 import org.apache.avalon.framework.configuration.Configuration;
@@ -713,7 +714,16 @@
 handleIOTrouble(ioe);
 }
 }
-//paintText(rx, bl, , f);
+
+boolean kerningAvailable = false;

+Map kerning = tf.getKerningInfo();
+if (kerning != null && !kerning.isEmpty()) {
+//kerningAvailable = true;
+//TODO Fix me when kerning is supported by the layout engine
+log.warn("Kerning info is available, but kerning is not yet implemented for"
++ " the PS renderer and not currently supported by the layout engine.");
+}
+
 String text = area.getTextArea();

 beginTextObject();
 writeln("1 0 0 -1 " + gen.formatDouble(rx / 1000f) 







Re: Relative font weights and font selection

2005-08-26 Thread Vincent Hennebert

Victor Mote wrote:

I am ignoring font-stretch for now. I am unclear whether it works similarly
to font-weight, or whether it is totally resolvable in the FO Tree.
Interestingly, CSS 2.1 (the only version of CSS 2 still available at W3C)
removes font-stretch entirely!!??!!


As I understand the spec, this works differently from font-weight and can be 
resolved in the FO Tree: just select the next expanded value for wider or next 
condensed for narrower. The font selection would be performed only after, when 
it is time to decide e.g. which font the keyword semi-expanded matches.
It's true that this is an extra feature that IMO can be simulated with a good 
font configuration file.
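
A minimal sketch of the FO Tree resolution described above, assuming the nine absolute font-stretch keywords of CSS2 ordered from narrowest to widest. The enum and its method names are hypothetical and are not part of FOP or FOray; the relative keywords are clamped at the extremes.

```java
/** Hypothetical helper: the absolute font-stretch keywords, narrowest first. */
enum FontStretch {
    ULTRA_CONDENSED, EXTRA_CONDENSED, CONDENSED, SEMI_CONDENSED, NORMAL,
    SEMI_EXPANDED, EXPANDED, EXTRA_EXPANDED, ULTRA_EXPANDED;

    /** Resolves "wider": the next more expanded keyword, clamped at the widest. */
    FontStretch wider() {
        FontStretch[] all = values();
        return all[Math.min(ordinal() + 1, all.length - 1)];
    }

    /** Resolves "narrower": the next more condensed keyword, clamped at the narrowest. */
    FontStretch narrower() {
        return values()[Math.max(ordinal() - 1, 0)];
    }

    public static void main(String[] args) {
        // The inherited value is resolved first; font selection happens later.
        System.out.println(FontStretch.NORMAL.wider());
    }
}
```

Only the keyword is resolved here; deciding which actual font matches the resolved keyword stays with the font selection step.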




For font-weight, there seems to be some ambiguity in the standard(s). There
are two possibilities, and neither CSS 2.1 nor XSL-FO seem to resolve the
matter:

1. Apply bolder and lighter to the inherited font to compute a weight
that is applied to the selected font.
2. Select the font, inheriting the weight from the inherited font, then
applying bolder and lighter to that weight.


I'd go with 1. Get the inherited font; find a darker one in the fonts database; 
get its weight value. That's it.
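
As a rough illustration of option 1, here is a sketch that looks up the next darker or lighter weight among the weights registered for a family. The class WeightLadder is hypothetical, it uses int weights (100 to 900) for readability, and it assumes that when no darker or lighter font exists the inherited weight is simply kept.

```java
import java.util.TreeSet;

/** Hypothetical lookup of the next bolder/lighter weight available in one family. */
final class WeightLadder {
    private final TreeSet<Integer> weights = new TreeSet<>();

    WeightLadder(int... availableWeights) {
        for (int w : availableWeights) {
            weights.add(w);
        }
    }

    /** Next available weight strictly darker than the inherited one, or the inherited one. */
    int nextBolder(int inherited) {
        Integer next = weights.higher(inherited);
        return next != null ? next : inherited;
    }

    /** Next available weight strictly lighter than the inherited one, or the inherited one. */
    int nextLighter(int inherited) {
        Integer next = weights.lower(inherited);
        return next != null ? next : inherited;
    }

    public static void main(String[] args) {
        WeightLadder times = new WeightLadder(400, 700);
        System.out.println(times.nextBolder(400));
    }
}
```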



In order to move forward, I suggest the addition of the following methods in
org.axsl.font.Font:

public byte nextBolderWeight();
public byte nextLighterWeight();
public org.axsl.font.Font nextBolderFont();
public org.axsl.font.Font nextLighterFont();

This will allow the client application (FOP) to use whichever algorithm it
thinks is appropriate. The bad news is that this ties each registered font
to exactly one font-family, something I was hoping to avoid.


That seems OK.
The only reason I see for a font to belong to several families is when there 
is a specific family (Times, Helvetica) and a generic one (serif, 
sans-serif...). In this case a generic family would be mapped to a specific one, 
and I don't think your proposed methods prevent that. Otherwise I don't see 
much point in mixing several families to build a complete set. The result would 
be visually bad IMO.

I may have missed something: I haven't studied that point yet.



There is another area of complexity in font selection that has not yet been
addressed, so I pose it here to Vincent and Manuel especially, and to any
others who wish to comment. The whole issue of whether the Font has a glyph
for the character(s) has not yet been addressed. The best idea I have for
this is as follows:

1. Add a char to the signature of org.axsl.font.FontServer.selectFont. This
char represents the first char of the text for which the font is being
selected. This allows the selection process to pass by a font-family if it
cannot paint the character.


So let's assume that I have a line of text to render. IIUC I would use it like 
this:
* first call with the first char of the text to get the font that will be 
generally used
* an additional call for each character for which there is no glyph in the 
general font

Is that what you mean?



2. Add the following method to org.axsl.font.Font:
/**
 * Examines each character in string to ensure that a glyph exists in
the font for that
 * character. If a character has no glyph in the font, the character's
index in string
 * is returned.
 * @return The index in string of its first character for which no glyph
exists in this
 * font. If all characters in the string have glyphs in this font, -1 is
returned.
 */
public int unavailableChar(String string);

Add also an overridden version of this method with char[] as the
parameter.


Why not directly return an array of all indexes where there is a missing glyph? 
Or add a beginIndex parameter so that one doesn't have to artificially recreate 
a String made of the initial String minus all characters up to the first missing 
glyph?
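
Both suggestions can be sketched as follows. This is only an illustration: the class GlyphCoverage, its method signatures and the hasGlyph predicate (standing in for the font's real glyph lookup) are assumptions, not part of org.axsl.font.Font.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical variants of unavailableChar, using a stand-in glyph lookup. */
final class GlyphCoverage {

    /** Index of the first char at or after beginIndex that has no glyph, or -1. */
    static int unavailableChar(String text, int beginIndex, IntPredicate hasGlyph) {
        for (int i = beginIndex; i < text.length(); i++) {
            if (!hasGlyph.test(text.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    /** All indexes in text whose char has no glyph, in ascending order. */
    static List<Integer> allUnavailableChars(String text, IntPredicate hasGlyph) {
        List<Integer> missing = new ArrayList<>();
        int i = unavailableChar(text, 0, hasGlyph);
        while (i >= 0) {
            missing.add(i);
            i = unavailableChar(text, i + 1, hasGlyph);
        }
        return missing;
    }

    public static void main(String[] args) {
        // Pretend the font only covers ASCII.
        IntPredicate asciiOnly = c -> c < 128;
        System.out.println(allUnavailableChars("a\u00e9b\u00e7", asciiOnly));
    }
}
```

With a beginIndex parameter the caller can resume scanning after each miss without building substrings.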





Between these two, I think an application should be able to efficiently
subdivide a chunk of text based on the various fonts that may need to be
used to process it.


In the long-term the font-selection-strategy will have to be implemented. The 
preceding stuff may need to be completed.





Comments on any of this are very welcome. I had hoped to defer some of these
font selection issues for a while yet, and you guys are frankly ahead of me
in needing to resolve them, so I will be glad to react to those who may have
thought it through more than I have.


I wish I could be more helpful, but I haven't considered all aspects of the 
problem yet and I don't catch the whole point. I'd like to first finish the font 
integration work.

IMHO this feature is for now not that important. What do other committers think?

Vincent



Re: Relative font weights and font selection

2005-08-27 Thread Vincent Hennebert

Victor Mote wrote:
As I understand the spec, this works differently from 
font-weight and can be resolved in the FO Tree: just select 
the next expanded value for wider or next condensed for 
narrower. The font selection would be performed only after, 
when it is time to decide e.g. which font the keyword 
semi-expanded matches.
It's true that this is an extra feature that IMO can be 
simulated with a good font configuration file.



Just to be clear, I understand your last sentence to be addressing a
different topic than the first part of this statement. That is, font
configuration won't be at all involved with the *resolution* of font-stretch
in what you have proposed. However, it may be involved from the standpoint
of implementing a resolved font-stretch value in that font-stretch could be
simulated using PostScript or PDF text parameters. Did I understand this
correctly?


Yes, except the end: we agree that it would not be the purpose of the font 
config file to solve font-stretch. What I meant is that we could use a 
workaround by specifying different font families for expanded fonts; e.g. one 
family Times-Normal and one Times-Expanded, instead of one family Times with two 
font-stretch variants, Normal and Expanded. The user, instead of changing the 
font-stretch property, would change the font-family.




For all of this, probably the best approach is for someone to do exactly
what you have done above: suggest changes to the interface that will provide
the information needed.


I'll put the problems of font-stretchability and glyph substitution on my 
personal todo list. I may consider those problems later, when I'm (at last!) 
finished with the font integration work.


Vincent


Re: [Xmlgraphics-fop Wiki] Update of ReleasePlanFirstPR by ChrisBowditch

2005-08-31 Thread Vincent Hennebert

Chris,

I'm afraid I don't agree with you here. You seem to mix up space conditionality 
and space precedence. See § 4.3 of the spec, and especially § 4.3.1, 
Space-resolution rules [1]
Conditionality only stands for spaces that begin a reference area; as per the 
first rule, all of the conditional spaces that begin a reference area are discarded.
Now for remaining spaces the precedence is to be considered. The space with the 
highest precedence wins.


In your sample (I have added a space-after for clarity):
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="a4" page-width="210mm"
        page-height="297mm" margin="5mm">
      <fo:region-body margin-top="90mm" margin-bottom="90mm"
          background-color="blue"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="a4">
    <fo:flow flow-name="xsl-region-body" font-size="18pt">
      <fo:block space-before="5mm" space-before.conditionality="retain"
          space-after="10mm">There should be 5mm of space before this block</fo:block>
      <fo:block space-before="5mm"
          space-before.conditionality="discard">There should be 10mm of space before this block
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>

The second fo:block does not begin a reference area, so space conditionality 
isn't taken into consideration. For both spaces, precedence is not specified so 
the default value of 0 is used (§ 7.10.5 & 7.10.6). The third rule of § 4.3.1 
states that between the two spaces of the same precedence, the one that has the 
highest (optimum) value wins; here the space-after of the first block.


Imagine a scenario where you have many different documents generated by 
different stylesheets, which all share a common styleset, i.e. an 
imported XSL file containing xsl:attribute-sets. The styles defined 
there are used throughout the documents. Now in some documents you might 
have two adjacent paragraphs that both use styles with 
space-before="10pt" and space-after="10pt".


In the document it is desirable only to have 10pt between the paragraphs 
but how can this be achieved? The paragraphs need to use the styles they are 
using for other reasons. This is where discard comes in handy. Add 
conditionality="discard" to the two styles and then the space from one 
gets dropped.


Here we disagree: if I understand the spec correctly, it is precedence that 
should be used, e.g. as in the following:
<fo:block space-after="10mm" space-after.precedence="100">There should be 
10mm of space after this block</fo:block>
<fo:block space-before="10mm" space-before.precedence="200">There should be 
10mm of space before this block</fo:block>


Here the space-before of the second block has a higher precedence than the 
space-after of the first one, and thus wins. The resolved space is 10mm.
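
The selection described in this message can be sketched as below. It is a deliberate simplification: it only covers non-forcing spaces that have survived the conditionality rule, and it ignores the .minimum and .maximum components. The class and field names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical, simplified resolution of adjacent non-forcing spaces. */
final class SpaceResolution {

    static final class Space {
        final double optimumMm;
        final int precedence;

        Space(double optimumMm, int precedence) {
            this.optimumMm = optimumMm;
            this.precedence = precedence;
        }
    }

    /** Highest precedence wins; among equal precedences the largest optimum wins. */
    static Space resolve(List<Space> adjacent) {
        return adjacent.stream()
                .max(Comparator.comparingInt((Space s) -> s.precedence)
                        .thenComparingDouble(s -> s.optimumMm))
                .orElseThrow(() -> new IllegalArgumentException("no spaces to resolve"));
    }

    public static void main(String[] args) {
        // space-after="10mm" followed by space-before="5mm", both with precedence 0
        Space winner = resolve(Arrays.asList(new Space(10, 0), new Space(5, 0)));
        System.out.println(winner.optimumMm + "mm");
    }
}
```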



 As per the testcase spaces-block2, I similarly think there should be a
 space between the first and second block on page 2; anyway, in this
 case the actual behaviour is probably wrong as the space resolution
 rules (if I understand them correctly) seem to imply that it should
 be only 10 points.


 Yes, that's right there should be less space - not no space.

I think Lucas is right here. The resolved space should be 10pt, that is the 
optimum space of the two spaces of same value and same precedence. If .minimum 
and .maximum were specified, the resolved space would take the minimum of the 
two .minimum values and the max of the two .maximum values for its own .min and 
.max values.


Do you agree?

Vincent


[1] http://www.w3.org/TR/xsl/slice4.html#area-space



Re: [Xmlgraphics-fop Wiki] Update of ReleasePlanFirstPR by ChrisBowditch

2005-08-31 Thread Vincent Hennebert

Chris Bowditch wrote:


The second fo:block does not begin a reference area, so space 
conditionality isn't taken into consideration. For both spaces, 
precedence is not specified so the default value of 0 is used (§ 
7.10.5 & 7.10.6). The third rule of § 4.3.1 states that between the 
two spaces of the same precedence, the one that has the highest 
(optimum) value wins; here the space-after of the first block.



Well the current implementation doesn't work like that. Both spaces are 
included to give 20pt of space between the two paragraphs.


Then either the implementation is broken (i.e. the spec was misunderstood), or 
this is a not yet implemented feature. No flame here, just for clarity.





I'm not an expert in the details of the spec, but isn't the precedence 
ignored unless conditionality="discard"?


No, in fact both notions are orthogonal: conditionality only deals with space 
beginning a reference-area, precedence deals with priorities between several 
successive spaces. They both work independently.





Yes I do agree, with the details you describe. But I wasn't trying to 
drill into detail, I was just saying it's not quite right yet. So my 
point still stands: there is some work still required to get this 
working 100%.




That's fine. I replied because I thought this could help understanding the 
process. We agree that it is still WIP.
I wish I could do something in this area, as I find this functionality very 
powerful, but I'm currently concentrating on FOrayFont.


Vincent



Re: [Xmlgraphics-fop Wiki] Update of ReleasePlanFirstPR by ChrisBowditch

2005-08-31 Thread Vincent Hennebert

Manuel Mall wrote:


I would have thought it's more of a nice to have but not a requirement 
for this release.


Exactly. If FOrayFont is ready for this release, all the better.
It's difficult to say if it will be. The pdf library is now converted, 
PDFRenderer almost.
I think the PSRenderer and SVG will demand the most work. I currently don't know 
the impact on the Java2D and RTF renderers.


And when the integration is done there will have to be much debugging and 
non-regression tests IMO...


Vincent



Re: e-g with padding and borders

2005-09-01 Thread Vincent Hennebert
I'm not sure here. The fo:external-graphic uses the large-allocation-rectangle 
(§ 6.6.5), which comprises padding and border. This makes me say that in Manuel's 
example the fo:block's bpd should be calculated with the second formula. The 
fo:block's content forms a line whose line-stacking-strategy is max-height 
(default). Thus its allocation rectangle should comprise the image's border & 
padding (§ 4.5). And so does the block.


I may be wrong, as this part of the spec is still somewhat unclear to me.
WDYT?

Vincent

Jeremias Maerki wrote:

Indeed, the normal allocation rectangle of an inline area is different
than the one of a block area. See 4.3.2. Geometric Definitions in the
1.0 spec.

Border and padding for an inline area seem to be outside the allocation
rectangle in before and after directions. Interesting.

On 01.09.2005 17:29:50 Manuel Mall wrote:

I have a follow-up question on this. If we have something as simple(?) 
as this:


<fo:block background-color="orange">
  <fo:external-graphic
     src="../../resources/images/bgimg300dpi.jpg"
     border="solid 5pt"
     padding="5pt"
     background-color="white"/>
</fo:block>

would you expect the whole image including padding and borders to be 
within the bounds of the enclosing block or only the actual image to be 
in the block and the padding and borders to stick out at the top and 
bottom. It seems xep takes the latter approach and I am very uncertain 
in this area. Or to put it differently is the BPD of the enclosing 
block 
  bpd = image height + line-spacing 
or
  bpd = image-height + top_and_bottom_borders + top_and_bottom_padding 
+ line-spacing

?

Manuel
<snip/>





Jeremias Maerki





Re: [Xmlgraphics-fop Wiki] Update of ExtensionPoints by JeremiasMaerki

2005-09-02 Thread Vincent Hennebert

Luca,

I'm speaking here as a (future) Fop user. Just to let you know that I 
definitely want to support you in this area. I think your extensions would 
make Fop an extremely powerful typesetting system that would eventually beat 
TeX in the quality of page makeup. It's all the more interesting for me since my 
use of Fop would be to produce book-style documents.


Just a comment about your Wiki page: I'm not sure that modifying margins would 
produce visually appealing results. May it not disturb the reader when she 
notices that margins aren't the same after turning a page?

Otherwise I agree with all of your other propositions.

I wish you good success,
Vincent

Luca Furini wrote:
Speaking of extensions, I'd like to resurrect the layout extensions that 
were part of the code used to start the Knuth branch, but I want to be 
sure I'm allowed to do it.


The set of extensions (a couple of new properties, and some new value 
for an existing one) is aimed to give the user more control about the 
page breaking: in particular, via these extensions it is possible to 
give the application a list of properties that can be adjusted in order 
to fill all the available bpd of a region (in addition / substitution to 
the spaces between blocks [1]).


I started writing a wiki page about these extensions on the wiki at 
http://wiki.apache.org/xmlgraphics-fop/LayoutExtensions (I really should 
take some time to finish it!).


My highest-priority, short-term task is still to fix the behaviour of 
page-number and page-number-citation, as I think these formatting objects 
must work in the next release: I am almost done, I just have to finish 
handling the case of justified text. After that, obviously if there are 
no objections against this, I'd like to spend some time on the 
extensions, which I'm sure could come in handy for fop-users producing 
book-style (or report-style) documents.


For example, here is a link to a message in the xsl-editors mailing list 
requesting a feature which is completely equivalent to one of the layout 
extensions: 
http://lists.w3.org/Archives/Public/xsl-editors/2005JulSep/0007.html 
(many thanks to Jeremias for pointing it out to me!). Should I be 
allowed to keep working on this subject, I could answer him that fop 
will soon be able to cope with his request.


Regards
Luca

[1] ... which makes me think that I should work on space resolution 
rules too ... my to-do list keeps growing longer and longer! :-(







Re: e-g with padding and borders

2005-09-02 Thread Vincent Hennebert

Jeremias Maerki wrote:

The real problem IMO is probably block-level content in fo:inlines again.
How are these borders to be painted? A border around each
inlineblockparent (one for each block inside the inline)? I'm not sure
judging from the specification.


Here the spec starts being really complicated. I would say you're right, though 
I'm not sure. See the last sentence of § 4.2.2: "Unless otherwise specified, the 
traits of a formatting object are present on each of its generated areas, and 
with the same value. (However, see sections [4.7.2 Line-building] and [4.9.4 
Border, Padding, and Background].)" The referred sections don't seem to apply 
to the fo:inline case.
What disturbs me is that when one specifies a border around a chunk of text and 
there is line-breaking, this border should appear at the end of the first line 
and the beginning of the second line, as below:

             _______________
  This is a | chunk of text |
             ---------------
   _____________
  | with border | blah blah
   -------------

  blah blah

What is more intuitive and could be expected by a user is the following:
             ______________
  This is a | chunk of text
             --------------
  ____________
  with border | blah blah
  ------------

  blah blah

but IIUC this is not allowed by the spec. I ask for confirmation here.

So the example you provided with the 2 <fo:block>blah blah</fo:block> is 
rendered correctly in terms of borders (but there should be no space between 
them, probably part of the rendering problem you raised).


Vincent



Re: e-g with padding and borders

2005-09-02 Thread Vincent Hennebert
What disturbs me is that when one specifies a border around a chunk of text and 
there is line-breaking, this border should appear at the end of the first line 
and the beginning of second line, as below:

             _______________
  This is a | chunk of text |
             ---------------
   _____________
  | with border | blah blah
   -------------

  blah blah

What is more intuitive and could be expected by a user is the following:
             ______________
  This is a | chunk of text
             --------------
  ____________
  with border | blah blah
  ------------

  blah blah

but IIUC this is not allowed by the spec. I ask for confirmation here.



I would agree that this is not allowed by the spec. The traits are the
same for all areas. There don't seem to be any exceptions. Actually, I'm
glad there aren't; that would complicate things even more. :-) But maybe
someone who thinks this would be an important feature could probably
write an extension for that. :-)


I've just checked: with CSS it is the second layout that is rendered. So 
there would be an incompatibility here between XSL-FO and CSS, which is 
astonishing as the spec claims several times to promote compatibility.


Anyway, it's not an important feature for me :-]

Vincent


Re: e-g with padding and borders

2005-09-02 Thread Vincent Hennebert

Hi Andreas,

You're right. Indeed both situations below are handled by the standard, thanks 
to border conditionality and is-first/is-last traits.


Thanks for the pointer!

Vincent

Andreas L Delmelle wrote:

On Sep 2, 2005, at 17:44, Vincent Hennebert wrote:


Hi,

<snip />


             _______________
  This is a | chunk of text |
             ---------------
   _____________
  | with border | blah blah
   -------------

  blah blah

What is more intuitive and could be expected by a user is the 
following:

             ______________
  This is a | chunk of text
             --------------
  ____________
  with border | blah blah
  ------------

  blah blah


<snip />

Hmm... I remember reading something about this --wait a minute... Yep! 
Got it.


See Rec 4.3.1 Space resolution rules
all the way down
"The border or padding at the start-edge or end-edge of an inline-area I 
may be specified as conditional. If so, then it is set to zero if its 
associated edge is a leading edge in a line-area, and the is-first trait 
of I is false, or if its associated edge is a trailing edge in a 
line-area, and the is-last trait of I is false."


(see also: 7.7.9 border-before-width .. XSL modifications to the CSS 
Definition)


By default, the first would be applicable. If the user explicitly specifies
border-start-width.conditionality="discard", the result would have to be 
the second.
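
The quoted rule can be sketched as a small decision function. This is a hypothetical illustration only: the class name, the method names and the boolean parameters (which stand in for the discard setting and the area traits) are assumptions, not FOP code.

```java
/** Hypothetical sketch of the conditional border rule for inline areas. */
final class ConditionalBorder {

    /** Start border: dropped when discardable, on a leading edge, and is-first is false. */
    static double startBorderWidth(double width, boolean discard,
                                   boolean leadingEdgeInLine, boolean isFirst) {
        return (discard && leadingEdgeInLine && !isFirst) ? 0.0 : width;
    }

    /** End border: dropped when discardable, on a trailing edge, and is-last is false. */
    static double endBorderWidth(double width, boolean discard,
                                 boolean trailingEdgeInLine, boolean isLast) {
        return (discard && trailingEdgeInLine && !isLast) ? 0.0 : width;
    }

    public static void main(String[] args) {
        // First line fragment of a broken inline: its end border is dropped on discard.
        System.out.println(endBorderWidth(1.0, true, true, false));
    }
}
```

With discard set to false both functions return the full width, which corresponds to the first drawing; with discard set to true the borders at the line break vanish, which corresponds to the second.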


No extension needed.


Cheers,

Andreas





Re: Logging for FOrayFont

2005-09-03 Thread Vincent Hennebert

Hi Victor,

What I liked about the Avalon Logger is the one-to-one correspondence between it 
and Commons' Log; commons just has one more level, which is trace. So writing a 
Logger adapter that delegates logs to a Log instance is trivial.


Now it's different because PseudoLogger has 7 log levels + 1 debug level, 
whereas commons Log has 6 levels with different purposes. The best mapping that 
I see is the following:

PseudoLogger -> Log
finest          trace
finer           trace
fine            trace
debug           debug
config          info
info            info
warning         warn
severe          error

Log's fatal level wouldn't be used. Writing an adapter in the other way would 
have been somewhat easier (and BTW corresponds to commons' Jdk14Logger).
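
The mapping above could be sketched like this, assuming the PseudoLogger levels mirror java.util.logging.Level as described in the quoted message. The enum CommonsLevel and the class LevelMapping are hypothetical stand-ins for the Commons Logging method names; the debug row is left out because java.util.logging has no such level.

```java
import java.util.logging.Level;

/** Hypothetical names for the Commons Logging levels. */
enum CommonsLevel { TRACE, DEBUG, INFO, WARN, ERROR, FATAL }

/** Hypothetical mapping from java.util.logging-style levels to Commons levels. */
final class LevelMapping {

    static CommonsLevel toCommons(Level level) {
        if (level.equals(Level.SEVERE)) {
            return CommonsLevel.ERROR;
        }
        if (level.equals(Level.WARNING)) {
            return CommonsLevel.WARN;
        }
        if (level.equals(Level.INFO) || level.equals(Level.CONFIG)) {
            return CommonsLevel.INFO;
        }
        // FINE, FINER and FINEST all collapse to trace; FATAL is never produced.
        return CommonsLevel.TRACE;
    }

    public static void main(String[] args) {
        System.out.println(toCommons(Level.CONFIG));
    }
}
```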


Personally I tend to find the Commons log levels more intuitive and useful than 
the Jdk ones: I don't really know what to do with the 3 fine, finer, finest 
levels and the config level. May I suggest that you use Commons' style of 
levels instead?


That said, this is by no means dramatic. For me it's just a matter of writing 
another wrapper.


I agree that it's a bit cleaner if the font system has its own logging rules, 
independently of other existing logging systems. So no problem for me.


Vincent

Victor Mote wrote:

I just completed a project to make FOray's logging a bit more flexible. It
now logs from an interface called org.axsl.common.PseudoLogger. Logging
levels are the same as those for java.util.logging.Level (in Java 1.4 and
higher), except that integrals are used instead of Level instances.

I also wrote an implementation org.axsl.common.AvalonLogger, which FOray
uses (for now) when it needs to *create* a logger. Since all loggers in the
font system are supplied to the font system (instead of created within it),
FOP should simply pass a different implementation to keep its logging
consistent within itself. The AvalonLogger is a thin wrapper around an, er,
Avalon ConsoleLogger, and is essentially an Adapter between the Avalon
logging system and PseudoLogger. A similar approach can be used with
whatever logging system FOP decides it wants to use. Writing the adapter
should be fairly trivial, and it should be possible to use any logging
system with this approach.

I hope this makes the integration work a bit easier and the results more
satisfactory to FOP. Please let me know if you have questions.

Victor Mote



Re: Logging for FOrayFont

2005-09-04 Thread Vincent Hennebert

Victor Mote wrote:

Actually there is not a level named debug, although I might have defined
that constant equal to finest in one of the earlier versions.
This does not appear in CVS. I would suggest that you redefine such a constant 
to remove any ambiguity since, as you can see, it confused me.



Here is the
way I mapped the Avalon levels in the AvalonLogger implementation:
http://cvs.sourceforge.net/viewcvs.py/axsl/axsl/axsl-common/src/java/org/axsl/common/AvalonLogger.java?view=markup

FINEST  debug
FINER   info
FINE    info
CONFIG  info
INFO    info
WARNING warn
SEVERE  error
Why not. As I now know that debug corresponds to finest, I'll follow the same 
scheme for Commons Logging.



I don't really feel strongly about it either, but perhaps a bit more
strongly than you for the following reasons:
1. From a sheer standard aspect, I wanted to stay as close to the Java
logging system as possible. I would have used the java.util.logging.Level
instances (for type safety) instead of numeric constants, except for trying
to retain Java 1.3 compatibility.
2. I prefer to allow for more granularity rather than less (within reason),
even if we don't think we need it right now.
3. This is one of those things that you can change on Tuesday to make one
party happy, then change back again on Wednesday to make another party
happy, all for very little benefit. In short, there is no way to make
everyone happy.

I understand your concerns and agree with them.



Also, I don't know if you noticed the following methods:
info(String message)
warn(String message)
error(String message)
debug(String message)
which correspond directly to the Avalon methods of the same name, and are
intended to provide a sort of mapping for them.

Certainly, but I also have to map the logMessage method...


I don't mind adding one more
called trace(String message) if that would make the mapping concept more
clear for you.
Well, no need I think; as trace is below debug and debug is mapped to finest, 
there is no corresponding log level for trace.


I'm satisfied with your explanations. Please just add a LEVEL_DEBUG constant and 
I'm OK with your interface.


Regards,
Vincent




Re: regions and writing-mode

2005-09-04 Thread Vincent Hennebert

Hi Manuel,

Sorry for the delay.
I think you're right.

See the note in 6.4.12 fo:simple-page-master: For example, if the writing-mode 
of the fo:simple-page-master is lr-tb, then [region-body, region-before, 
region-after, region-start, and region-end] correspond to the body of a 
document, the header, the footer, the left sidebar, and the right sidebar.


And 6.4.14 fo:region-before: This region specifies a viewport/reference pair 
that is located on the before side of the page-reference-area. In lr-tb 
writing-mode, this region corresponds to the header region.


This should answer your question.

HTH,
Vincent

Manuel Mall wrote:

This is (again) more of a clarifying question as I am looking in that area
of the code and I think it's incorrect:

Am I correct in saying: The position of the before/after/start/end regions
on the output media is relative to the writing-mode and reference
orientation on the simple-page-master they belong to? 


Currently some of their positioning is determined by the writing-mode set on
the regions themselves, which usually would be the same as on the
simple-page-master, but it can be different and then the current
implementation seems to get itself confused.

Manuel


Re: Logging for FOrayFont

2005-09-05 Thread Vincent Hennebert
I'm satisfied with your explanations. Please just add a 
LEVEL_DEBUG constant and I'm OK with your interface.



OK, I have added the constant LEVEL_DEBUG back, and have also added a new
one called LEVEL_TRACE.
PLEASE NOTE: LEVEL_DEBUG is now equal to LEVEL_FINER (it previously was
equal to LEVEL_FINEST), and LEVEL_TRACE has been set equal to LEVEL_FINEST.
These changes have been made to better accommodate what I understand the
Commons Logging levels to be.

This makes the Avalon mapping look like this:
FINEST  debug
FINER   debug
FINE    info
CONFIG  info
INFO    info
WARNING warn
SEVERE  error
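
For reference, the revised mapping listed just above could be written as a small lookup like the following. This is a sketch only, not the actual AvalonLogger code: the class and method names are made up for illustration, and the numeric thresholds are simply those of java.util.logging.Level.

```java
import java.util.logging.Level;

/** Sketch of the revised level mapping listed above (names are illustrative). */
final class AvalonLevelMapping {
    private AvalonLevelMapping() { }

    /** Returns the Avalon-style method name for a java.util.logging level value. */
    static String avalonMethodFor(int level) {
        if (level >= Level.SEVERE.intValue()) {
            return "error";
        }
        if (level >= Level.WARNING.intValue()) {
            return "warn";
        }
        if (level >= Level.FINE.intValue()) {
            return "info";   // FINE, CONFIG and INFO all map to info
        }
        return "debug";      // FINER and FINEST
    }

    public static void main(String[] args) {
        Level[] levels = {Level.FINEST, Level.FINER, Level.FINE, Level.CONFIG,
                          Level.INFO, Level.WARNING, Level.SEVERE};
        for (Level level : levels) {
            System.out.println(level.getName() + " -> " + avalonMethodFor(level.intValue()));
        }
    }
}
```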


That's fine for me!

Thank you,
Vincent


User config file currently discarded

2005-09-06 Thread Vincent Hennebert

Hi,

While trying to debug my FOrayFont adaptation I noticed that the user config file 
currently isn't taken into account by the Trunk.


The apps.FOUserAgent.getConfig() method is actually never called within the 
code, and (as a consequence I suppose) neither is the 
PDFRenderer.configure(Configuration) method, whose purpose is among other things 
to register fonts specified in the user config file.


Is there a particular reason for this situation? A simple fix would be to call 
configure within the AbstractRenderer.setUserAgent method, where we can get the 
user config file associated with the UserAgent given as parameter.
This is perhaps not the right place to do that? But if it's ok I can provide a 
patch.
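
To make the proposal concrete, here is a minimal sketch of the idea. The classes below are simplified stand-ins invented for this example, not the real FOUserAgent, AbstractRenderer or PDFRenderer, and the "font-directory" setting is hypothetical; the only point is that setUserAgent is the place where the configuration gets applied.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for a user agent that holds the parsed user configuration. */
final class SketchUserAgent {
    private final Map<String, String> config = new HashMap<String, String>();

    Map<String, String> getConfig() {
        return config;
    }
}

/** Simplified stand-in for a renderer base class. */
abstract class SketchRenderer {
    protected SketchUserAgent userAgent;

    /** Stores the user agent and immediately applies its configuration, if any. */
    void setUserAgent(SketchUserAgent agent) {
        this.userAgent = agent;
        Map<String, String> config = agent.getConfig();
        if (config != null) {
            configure(config);
        }
    }

    /** Subclasses read the settings they care about, e.g. font registrations. */
    abstract void configure(Map<String, String> config);
}

/** Simplified stand-in for a concrete renderer. */
final class SketchPdfRenderer extends SketchRenderer {
    String fontDirectory; // hypothetical setting, for illustration only

    @Override
    void configure(Map<String, String> config) {
        fontDirectory = config.get("font-directory");
    }
}
```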


Vincent


FOrayFont, PS/PDFTranscoders and SVG handling

2005-09-12 Thread Vincent Hennebert
I'm about to convert the SVG library to FOrayFont. But the Batik side seems to 
be reluctant to see the transcoders converted to FOrayFont [1].
How should I handle that? I guess I should leave existing files as is and 
provide new files corresponding to the FOrayFont implementation? How should I 
name them? Perhaps a new subpackage?

For pdf, does it concern other files than those in the svg subpackage?
Which files in the render.ps subpackage are concerned?
What about the pdf library?

All this is still a bit unclear in my head.
In two words: please help...

Vincent


[1] http://marc.theaimsgroup.com/?l=fop-dev&m=112600990201878&w=2


Re: FOrayFont, PS/PDFTranscoders and SVG handling

2005-09-12 Thread Vincent Hennebert

Jeremias and Victor,

thanks for the hints. I'll keep them at hand for later, when it is time to 
migrate the stuff into XML Graphics Commons.
For now I'll just override the current implementations with FOrayFont. Anyway it will 
be possible to recover them with svn, in case they have to coexist.


Vincent


PSDocumentGraphics2D and Font dictionary

2005-09-12 Thread Vincent Hennebert
In PSDocumentGraphics2D.writeFileHeader (and also in 
PSRenderer.startPageSequence) the font dictionary is written into the PS file by 
a call to PSFontUtil.writeFontDict.
At this time all of the fonts present in the fontInfo (defaults + those found in 
the config file) seem to be written out, even those that won't be used in the fo 
file.


I'm a bit worried because I can't reproduce that easily with FOrayFont. All I 
can get is the set of fonts that were used within the document. I guess that 
rendering starts as soon as possible and that at the time when the file header 
is written out the whole document may not have been entirely parsed yet? (but 
the PDFRenderer only stores used fonts by making a call to 
FontInfo.getUsedFonts!? This also is the case in PSRenderer.stopRenderer).


So the question is: is there a means to put only the used fonts when writing out a PS 
font dictionary? This would be cleaner anyway.


I hope I'm clear.
Vincent


Re: PSDocumentGraphics2D and Font dictionary

2005-09-12 Thread Vincent Hennebert

Well, so there is no simple solution :-(

I could probably add a method like getConfiguredFonts in the font server to put 
in the postscript file all of the fonts defined in the config file. But that 
really sounds dirty to me.


A temporary solution (before implementing a two-pass approach) would be to only 
support Base14 fonts; BTW, are these fonts well defined in the postscript 
standard? Or do they only exist in PDF?
And a somewhat related question: how does font embedding work in postscript? I 
believe that it is like in PDF: embedding is not mandatory, one can simply put 
the font name in the file, and this will work if the corresponding font is 
installed on the client system. So this should almost always work for the fonts 
corresponding to the PDF Base14, and not always for others. Is there a 
font-naming convention?


So, depending on the answers to the preceding questions: what do we choose? 
Systematic font embedding or only putting the font name?


Thanks,
Vincent


Jeremias Maerki wrote:

I know exactly what you mean. The only way around this is to do a
two-pass approach when writing PostScript, meaning that you keep track
of resources (like fonts) while writing the pages and later you put
together the complete PostScript document by including the needed
resources in the right places. Obviously, that means losing a lot of
processing speed. PDF is in a better position because it's a
random-access file format while PS is streaming. We can add the font
objects to the PDF after we've already used them. On the other side, the
PDF generated this way cannot be a linearized file which allows
Fast Web View. The browser always has to load the whole PDF file to
display it because the cross-reference table is at the end of the file.
So, even PDF has, in a way, the same problem.

So you see: the problem is speed versus beauty.

BTW, that was the reason why I started introducing a better resource
handling with PS support, so we can later add such a mode where we write
the PS file in a two-pass approach.
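
As a rough illustration of the two-pass idea described above (not FOP's actual PS code), the sketch below buffers the page content while recording which fonts it uses, then assembles the final file with the resource comments in front. All class and method names are invented for the example; it assumes one font per page and does no escaping of parentheses in the text.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Sketch of a two-pass writer: pages are buffered, resources are emitted first. */
final class TwoPassPsWriter {
    private final List<String> pages = new ArrayList<String>();
    private final Set<String> usedFonts = new LinkedHashSet<String>();

    /** Pass 1: buffer the page content and remember which font it needs. */
    void addPage(String fontName, String text) {
        usedFonts.add(fontName);
        pages.add("/" + fontName + " findfont 12 scalefont setfont\n"
                + "72 720 moveto (" + text + ") show\nshowpage\n");
    }

    /** Pass 2: write the header with only the fonts actually used, then the pages. */
    String assemble() {
        StringBuilder out = new StringBuilder("%!PS-Adobe-3.0\n");
        boolean first = true;
        for (String font : usedFonts) {
            // DSC continuation lines start with "%%+"
            out.append(first ? "%%DocumentNeededResources: font " : "%%+ font ")
               .append(font).append('\n');
            first = false;
        }
        out.append("%%EndComments\n");
        for (int i = 0; i < pages.size(); i++) {
            out.append("%%Page: ").append(i + 1).append(' ').append(i + 1).append('\n');
            out.append(pages.get(i));
        }
        out.append("%%EOF\n");
        return out.toString();
    }
}
```

The cost mentioned above is visible here: nothing can be streamed to the output until the last page has been seen.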

On 12.09.2005 21:40:11 Vincent Hennebert wrote:

In PSDocumentGraphics2D.writeFileHeader (and also in 
PSRenderer.startPageSequence) the font dictionary is written into the PS file by 
a call to PSFontUtil.writeFontDict.
At this time all of the fonts present in the fontInfo (defaults + those found in 
the config file) seem to be written out, even those that won't be used in the fo 
file.


I'm a bit worried because I can't reproduce that easily with FOrayFont. All I 
can get is the set of fonts that were used within the document. I guess that 
rendering starts as soon as possible and that at the time when the file header 
is written out the whole document may not have been entirely parsed yet? (but 
the PDFRenderer only stores used fonts by making a call to 
FontInfo.getUsedFonts!? This also is the case in PSRenderer.stopRenderer).


So the question is: is there a means to put only the used fonts when writing out a PS 
font dictionary? This would be cleaner anyway.


I hope I'm clear.
Vincent





Jeremias Maerki



Re: PSDocumentGraphics2D and Font dictionary

2005-09-13 Thread Vincent Hennebert
Let's look at it from another side. If someone writes some 
kind of FO editor or a configuration tool for FOray/FOP a 
method that reports all available fonts will certainly be useful. :-)


OK. That makes sense. To avoid wasteful parsing, it will mean that at least
3 new classes need to be exposed through interfaces (RegisteredFont,
RegisteredFontFamily, and RegisteredFontDesc), which may be a good thing
anyway.


Yes, I think it could be interesting. It would also be necessary to add 
getStream methods, now that font parsing is delegated to the font server. 
Currently there is only one getPDFFontFileStream method. There should perhaps be 
also a getPSFontFileStream, and something like getPDF/PSSubset? It seems that 
the client is unable to do font subsetting with the current interface.


RegisteredFontFamily and RegisteredFontDesc might also be interesting for the 
AWT renderer, but that's another purpose. I'll perhaps come back to this later.



(more below)


Very good. It sounds like you and I may end up with API visions that match
better than I might have thought at one time.


Actually, you are no longer tied to WinAnsi. We have a lot more 
flexibility on encodings than before:
1. All of the predefined encodings for both PostScript and PDF are 
available to either platform -- of course, if they are not predefined 
for the platform used, they must be written into the output.
2. Both platforms have access to the font's internal encoding.
3. The user can specify custom encodings through the 
font-configuration file.


So, if a PostScript document can use the font's internal encoding, and 
if the font is known to already be available to the interpreter, I 
think it could safely be used by name. But perhaps I have forgotten something.

No, that's true. I simply haven't cared, yet, about finding 
out how glyphs are accessed on-the-fly in PS that are not 
accessible through the encoding. Rewriting the encoding seemed easier.



I am very sure that for Type 1 fonts, specifying another encoding is the
only way to get it done. There is just no way to get more than 256
combinations out of 8 bits and there is no way to get more than 8 bits.
However, the good news is that I am 99% sure that for both PDF and
PostScript you can specify the same underlying font with two (or more)
different encodings. They will actually show up as two different font
objects in the document and must of course be referred to that way also.
I'll let you know how that turns out.
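
To illustrate the 8-bit limit discussed above, here is a small sketch that spreads the characters of a document over as many 256-slot encodings as needed, so that the same underlying font could then be declared once per encoding. It is purely illustrative: the class is invented for this example, and a real implementation would also have to reserve slots (such as .notdef) and map characters to glyph names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: assign each distinct character an (encoding index, 8-bit code) pair. */
final class EncodingSplitter {
    static final int SLOTS = 256; // an 8-bit code can address at most 256 glyphs

    /** One list per derived encoding; the position in the list is the character code. */
    private final List<List<Character>> encodings = new ArrayList<List<Character>>();
    private final Map<Character, int[]> assigned = new LinkedHashMap<Character, int[]>();

    /** Returns {encodingIndex, code} for the character, allocating a slot if needed. */
    int[] slotFor(char c) {
        int[] slot = assigned.get(c);
        if (slot == null) {
            if (encodings.isEmpty()
                    || encodings.get(encodings.size() - 1).size() == SLOTS) {
                // the current encoding is full: start a new font/encoding pair
                encodings.add(new ArrayList<Character>());
            }
            List<Character> current = encodings.get(encodings.size() - 1);
            current.add(c);
            slot = new int[] {encodings.size() - 1, current.size() - 1};
            assigned.put(c, slot);
        }
        return slot;
    }

    /** Number of font objects that would be needed in the output document. */
    int encodingCount() {
        return encodings.size();
    }
}
```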


This may require a new font-configuration item for the font element 
that allows it to tell whether it is known to be available to the 
PostScript interpreter. There are some other possibilities here as well.

I bet. Sounds good.



The more I think about it, encapsulating the characteristics of a specific
PostScript interpreter is probably the right way to go. Then the rendering
run can use that to decide whether the font needs to be embedded or not.
I'll have to ponder that for awhile.


Here I'm beginning to get lost because I don't know the Postscript standard.


My hope of being ready before the upcoming release is starting to vanish... :-(
Here's my summary of the current discussion:
1. Currently the Fop PSRenderer embeds all of the configured fonts in the PS 
file, even those that will never be used. It does this by parsing the 
font files itself;
2. I can't reproduce this behavior with aXSL and FOray easily, because I've no 
direct access to the font files;
3. Still doing this would require hacking the FOrayFont subpackage; that would 
result in something dirty but that should work;
4. Anyway there are several improvements to bring to the PS renderer: mainly 
character encoding, font embedding and in a longer term two-pass rendering for a 
proper font handling.


Now I'm thinking of the next release: simply putting the font name in the 
postscript file would be rather straightforward to implement, and should work 
in most cases (?), thanks to the non-standard but well-known base14 (and 
even base35) font set. But that's definitely a regression from the current state.
Improving the PS renderer to allow proper embedding will require (1) changes to 
 the aXSL interfaces (so a certain amount of discussions), (2) me to learn 
Postscript. That would prevent the FOrayFont subsystem from being integrated in 
the pre-release.


Do you agree with my summary?

Integrating FOrayFont in the pre-release would be great...
Deciding to delay the integration would give me more time to investigate the 
insides of FOrayFont, learn PS and PDF standards and so do things much better.


If there is a decision to make it does not belong to me...
Vincent


Re: PSDocumentGraphics2D and Font dictionary

2005-09-13 Thread Vincent Hennebert

Victor Mote wrote:

I am not sure what you mean getPDF/PSSubset.


If I'm correct it is only possible to embed the whole font file in a pdf output, 
by using getPDFFontFileStream. Currently aXSL doesn't seem to provide a means to 
embed only a subset.




Point me to the FOP code that does the embedding, class name(s) and line
numbers, and I'll see if I can extract it into an aXSL-exposed method.


The whole code is in the class render.ps.PSFontUtil, mainly the method 
embedType1Font.



3. Still doing this would require hacking the FOrayFont subpackage; that would 
result in something dirty but that should work;


Better would be to just make aXSL provide what needs to be provided. If we
can hack FOray to do it, then we should be able to expose what is needed.
Since nothing we are talking about here is a pollution of the interface, we
should just be able to change the interface.


On this point I was more thinking of a quick short-term solution for the 
pre-release, before taking the time to think about a clean implementation.



4. Anyway there are several improvements to bring to the PS renderer: mainly 
character encoding, font embedding and in a longer term two-pass rendering for a 
proper font handling.


OK. I am confused. I thought above that font embedding worked in PS now, but
this seems to indicate that it does not.


Sorry, it also is a bit unclear to me. I think the precise status is the 
following:
1. font embedding only works with Type1 fonts for which a pfb file is provided 
(or also a pfa?). Subsetting --provided that this is specified by the postscript 
standard-- does not work;
2. currently only the WinAnsi charset seems to be supported. Fonts are 
systematically reencoded to this charset




I can take some of this burden off of you, in that I can hopefully fix aXSL
and FOray to provide what is needed. If that is done well, you shouldn't
need to learn too much PostScript to get it to work, and perhaps one of the
other developers can help you get it glued in. I don't know how much work it
will take for me to get the FOray PS Renderer working (it may work now), I
can use that as a test bed also.


I appreciate your offer to help! Today I quickly launched the FOray PS Renderer 
but it doesn't seem to work. I haven't investigated, though, this may be a minor 
problem.



Vincent


Re: PSDocumentGraphics2D and Font dictionary

2005-09-13 Thread Vincent Hennebert

Victor Mote wrote:

Jeremias Maerki wrote:

output format. Maybe the Font interface should simply have a 
method to return a very generic interface for more detailed 
and font- and output-system-specific access to the font. 
Consumers of this interface can then cast it to a special 
interface/class. Something like:
TargetFormatHelper Font.getTargetFormatHelper(String mime) 
Subclasses of TargetFormatHelper could be 
PDFTargetFormatHelper or a PSTargetFormatHelper. The Font 


This is an interesting idea, but, if I understand it correctly, breaks
pluggability.


aXSL and FOray easily, because I've no direct access to the font 
files;


Which is a problem IMO. See my comments above.


I *really* don't understand this. The whole point of the font subsystem is
to hide as much detail as possible from the client application. If you want
access to the raw font data, then perhaps the FOP 0.20.5 approach is better
for what you need??!!


To go a bit along with Victor, the font subsystem should perhaps provide more 
services, depending on the client (= the type of renderer):

* a font abstraction like it is now for the layout part;
* font manipulation facilities, like e.g. embedding and subsetting for the PDF 
renderer, conversion Type1 - SVG for the SVG one, etc. In fact I would rather 
put your proposed classes at the font subsystem level.



If aXSL-font provides access to the raw underlying font 
streams, that problem basically dissolves. The following 
would certainly be no problem:
InputStream Font.getRawStream(String part) where part may be 
pfb, pfm, afm, ttf etc.


Is this just for embedding purposes, or do you intend to parse it? If you
want to parse it, why? If all you want to do is embed it, why do you want
the metrics files? FOray essentially provides the raw font stream now. It
works for PDF, but, if I understand Vincent correctly, does not work for PS.
So how does this method you suggest help that?


See just above.



Integrating FOrayFont in the pre-release would be great...


Quite unrealistic as it stands now, sorry.


That is your (FOP's) decision, but it makes no sense to me. You are willing
to go backwards in almost any other area, but are unwilling to *not* go
forwards with PostScript font embedding? Even when it is doable?

Still, I appreciate knowing. I'll shift my focus back to getting my FOray
release out the door.


Victor, from a non-native speaker POV you seem to be a bit overreacting here. I 
have the feeling that I have misled you because of my bad understanding of the 
problem. I'm sorry if this is the case.
Jeremias has a better vision of the situation than me, and I quite agree with 
him that the integration won't be ready for the pre-release. This does not mean 
that it will never be done. And after all, all the better: we will have more 
time to discuss a clean API.


Regards,
Vincent

P.S.: that said, the PDFRenderer should now work fine with the new font system; 
converting the SVG library should be pretty easy; this basically works for the 
AWT viewer. Nothing perfect, but... ;-)


Re: FOray contacts

2005-09-15 Thread Vincent Hennebert

I completely agree with Manuel.

Whereas I can feel your disagreement with some decisions for the project you 
have always remained nice and made valuable comments. I regret your decision to 
leave this list because you have often been helpful where you were not expected.


I'll be glad to continue to discuss with you on the FOray lists.

Cheers,
Vincent


Manuel Mall wrote:

Victor,

On Wed, 14 Sep 2005 09:07 am, Victor Mote wrote:


FOP devs:

I think it is prudent for me to take this temporary lull to extricate
myself a bit more by unsubscribing from the fop-dev mailing list. I
have tried to do this several times before, with little success, as
you can see. I have no projects underway and no feuds to tend to ATM,
so it is a rare (unique really) opportunity.



personally I believe anyone who has worked on fop and has an active 
interest in XSL-FO can still make valuable contributions here. 

Even if you don't want to get involved in (further) design discussion 
based on events in the past I believe you still have lots to 
contribute. For example at the level of input to issues with the spec 
itself and its interpretation because of your extensive knowledge in 
that area. 


Therefore I am very sorry to see you leave this list.



Victor Mote



Regards

Manuel


2 weeks offline

2005-10-25 Thread Vincent Hennebert

Hi all,

I'll be offline from tomorrow for 2 weeks: visiting Japan.

Although I haven't had much time to work on Fop these last few days, I am not 
abandoning my work. I've taken a little break in the adaptation to learn a bit of 
PDF. I think this is necessary to better understand what I'm doing.


The AWT renderer should now fully work. I've recently had a long discussion with 
Victor to properly handle fonts for both screen and print outputs; it is now 
possible to map the FO generic font names (serif, sans-serif, monospace) to 
either the default awt fonts (corresponding to the Lucida family), or to the 
Times/Helvetica/Courier families. This should help having the same result for 
both types of outputs.
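
As an illustration of the kind of mapping meant here, a sketch follows. It is not the actual FOray or aXSL configuration mechanism: the class is invented for this example, and "Serif", "SansSerif" and "Monospaced" are simply Java's logical font names, used here as the AWT defaults.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Sketch: resolve FO generic font names to a concrete family per output type. */
final class GenericFontMapper {
    private final Map<String, String> families = new HashMap<String, String>();

    /** @param useAwtDefaults true for screen-oriented defaults, false for the print families */
    GenericFontMapper(boolean useAwtDefaults) {
        if (useAwtDefaults) {
            // Java's logical font names
            families.put("serif", "Serif");
            families.put("sans-serif", "SansSerif");
            families.put("monospace", "Monospaced");
        } else {
            families.put("serif", "Times");
            families.put("sans-serif", "Helvetica");
            families.put("monospace", "Courier");
        }
    }

    /** Returns the mapped family for a generic name, or the name itself otherwise. */
    String resolve(String requested) {
        String mapped = families.get(requested.toLowerCase(Locale.ROOT));
        return mapped != null ? mapped : requested;
    }
}
```

Using the second mapping for both the AWT viewer and the PDF/PS outputs is what should give matching results on screen and in print.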


As the RTF renderer doesn't seem to depend on the font subsystem at all, I guess 
it should work fine as well.


What's left: correct a little problem with CID fonts in the PDF renderer, adapt 
the PS renderer (which should be much easier now), adapt the SVG library for PDF 
and PS.


Cheers,
Vincent


Re: Preparing for the first release

2005-11-14 Thread Vincent Hennebert

Manuel Mall wrote:
As the project hasn't done a release for a long time and especially no 
release of the new codebase we should test probably a bit more 
extensively than usual that the distribution builds actually are 
working and don't contain any 'cheap' errors.


To that effect I have built binary and source distributions from the 
current svn and made them available for download from 
http://people.apache.org/~manuel/fop/disttest. In the top level 
directory are the source and the java 1.4+ binary distributions. In the 
java1.3.1 directory are only binary distributions. 


I'm on a Debian GNU/Linux environment with both java 1.4.2 and java 1.5. I have 
encountered no particular problem by running the binary version on a few sample 
fo files. The source distribution also seems to build and run fine.


My 2 cents...
Vincent



Re: svn commit: r348291 - /xmlgraphics/fop/trunk/src/documentation/content/xdocs/dev/release.xml

2005-11-23 Thread Vincent Hennebert

Author: jeremias
Date: Tue Nov 22 15:34:57 2005
New Revision: 348291

URL: http://svn.apache.org/viewcvs?rev=348291&view=rev
Log:
Collect places to announce FOP releases.

Modified:
xmlgraphics/fop/trunk/src/documentation/content/xdocs/dev/release.xml

Modified: xmlgraphics/fop/trunk/src/documentation/content/xdocs/dev/release.xml
URL: 
http://svn.apache.org/viewcvs/xmlgraphics/fop/trunk/src/documentation/content/xdocs/dev/release.xml?rev=348291&r1=348290&r2=348291&view=diff
==
--- xmlgraphics/fop/trunk/src/documentation/content/xdocs/dev/release.xml 
(original)
+++ xmlgraphics/fop/trunk/src/documentation/content/xdocs/dev/release.xml Tue 
Nov 22 15:34:57 2005
@@ -75,5 +75,22 @@
 <li>Stefan Bodewig's <jump 
href="http://cvs.apache.org/~bodewig/mirror.html">Making your Downloads 
Mirrorable</jump></li>
   </ul>
 </section>
+<section id="announcements">
+  <title>Announcing the release</title>
+  <p>Here's a collected list of places where to announce new FOP 
releases:</p>
+  <ul>
+<li>fop-dev@xmlgraphics.apache.org</li>
+<li>fop-users@xmlgraphics.apache.org</li>
+<li>general@xmlgraphics.apache.org</li>
+<li>general@xml.apache.org</li>
+<li>announce@apache.org</li>
+<li>[EMAIL PROTECTED]</li>
+<li>[EMAIL PROTECTED]</li>
+<li>http://xslfo-zone.com/news/index.jsp</li>
+<li>http://www.w3.org/Style/XSL/</li>
+<li>http://freshmeat.net/projects/fop/</li>
+<li>any others?</li>


The docbook-apps@lists.oasis-open.org list may be added, although an announcement has 
already been made on this list (see the message attached).
Note that Fop 0.20.5 is still much used by Docbook users. IMO Fop 0.90 will be 
welcome there.


Vincent


-------- Original Message --------
Subject: [docbook-apps] FOP 0.90alpha1
Date: Tue, 22 Nov 2005 14:52:57 +0100 (CET)
From: Jens Stavnstrup [EMAIL PROTECTED]
Reply-To: Jens Stavnstrup [EMAIL PROTECTED]
To: docbook-apps@lists.oasis-open.org

After a very long time a new FOP is being released. Be aware as the name
indicates, that this is an alpha release and may in certain areas be
inferior to FOP 0.20.5 [1]. Otherwise it is a huge step forward as the
compliance page indicates [2]. Great work foppers.

FOP 0.90alpha1 is currently being replicated to the apache mirrors.

Regards,

Jens

[1] http://xmlgraphics.apache.org/fop/trunk/upgrading.html
[2] http://xmlgraphics.apache.org/fop/compliance.html




Re: Text handling in svg files, transcoders

2005-11-26 Thread Vincent Hennebert

Hi Thomas,

[EMAIL PROTECTED] wrote:
But this doesn't work when I run Fop with the same svg included in an fo 


file. 


Am I missing something?


   I take it this is an FO with inline SVG consisting of an SVG 'image'
element referencing the SVG file?


The svg file is referenced by an external-graphic element.



 With the 0.9alpha1 FOP and Batik 1.6
it turns out this won't work properly. If you bundle Batik from SVN this 
should work.


That's it. It works with the newest versions of Batik and Fop.

Thanks!

Vincent


Re: Text handling in svg files, transcoders

2005-11-28 Thread Vincent Hennebert

Jeremias Maerki wrote:

On 25.11.2005 16:25:43 thomas.deweese wrote:
<snip/>


Thomas, what do you think about this topic?


  Well I think that currently the text bridges do a pretty good job
determining if they are capable of drawing text as PDF text and drop
back to curves when needed.  I would much rather work on catching cases
where this doesn't work properly than adding another option. Do you
know of cases right now where this doesn't work?



Yes, the issue when someone uses custom fonts. Text drawn as shapes uses
the AWT font subsystem to get at the fonts while text drawn as text
needs FOP's own font subsystem to select/embed the right fonts in the
target file. I assume some people would probably prefer to simply have
their things as simple as possible and not to have to manage an extra
font setup. On the other side, when Vincent is done with his work, we'll
have font auto-discovery which will improve the situation a lot.


I'm a bit worried about what you mean by font auto-discovery: FOray doesn't have
font discovery support. This presents some difficulties, as already
discussed on this list [1].

I don't want to bring bad news, but...

Vincent


[1] http://marc.theaimsgroup.com/?l=fop-dev&m=111876477207479&w=2



Re: Text handling in svg files, transcoders

2005-11-28 Thread Vincent Hennebert

Jeremias Maerki wrote:

Hey, allow me some wishful thinking on my part. :-) Look, if FOrayFont
supports fonts without the need for a PFMReader or a TTFReader then the
road to font auto-discovery is just a very small step. And the former
is an absolute must. Otherwise, the whole thing is a waste of time. I'm
not even afraid about the performance penalty of an auto-discovery
feature. If someone points to the system's font directory with 500 fonts
it's his own fault if the whole thing takes a little time to preload. If
you don't do auto-discovery, that's fine. I'll do it then. I want it. :-)


Ok, then no problem ;-)
Just wanted to make sure you didn't believe that this functionality was 
already existing.


Vincent



Re: [was in Fop-user] text search in PDF

2005-11-29 Thread Vincent Hennebert

Jeremias Maerki wrote:

Well, this has been a known issue for a long time and it is still not
addressed in FOP 0.90alpha1. However, someone is working with the FOray
project to build a better font library for FOP. Victor Mote already
fixed the problem in his FOray but I can't tell what the status of his
project is. Of course, that means there's probably no short-term
solution.


FYI: I've not investigated much concerning this problem, as I'm still
missing some knowledge. However, IIC all the needed elements should be
present in FOrayFont. When my patch is available it should be pretty
straightforward for someone with the right knowledge to implement this functionality.

Ok, should not last too long now...

Vincent



Re: MathML for fop-trunk

2005-12-09 Thread Vincent Hennebert

2. When rendering MathML we used the awt.Font class, which is not
accessible on Unix boxes without an X server installed. The awt.Font
class is used to get the font size for proper positioning of the upper and
lower indexes and so on. Is there any chance that we could re-use
Fop-generated metrics? Where could we start looking at that?



In FOP Trunk, the fonts package. Everything's there.

However, currently Vincent Hennebert works on integrating the font
subsystem from FOray which will be a little different. It may make sense
not to rush into anything too quickly here. Vincent implied that it
wouldn't take too long until he has something ready if I remember
correctly.


Right. My patch should be available in a few days now.

You may start to have a look at the aXSL web page [1], and especially
the axslFont module. This is the interface that will be available to
users of the font sub-system. I suggest you download aXSL through the
CVS repository and just build the javadoc to see what will be available
to you (the most important interfaces are Font and FontUse).

If you decide to look at aXSL and have any questions, please ask them on
the aXSL mailing-list [2], which will be more appropriate than this
list.

I'm glad to see that MathML support is coming along.
Thank you,
Vincent


[1] http://www.axsl.org
[2] http://sourceforge.net/mail/?group_id=123259


FOrayFont patch almost ready

2005-12-12 Thread Vincent Hennebert

Team,

I've just posted an updated pre-patch of my FOray adaptation work. I put
it as a pre-patch because the junit tests don't run well anymore (about
75 errors with junit-layout-standard). However, the pdf output looks
right on the few tests I have run. The weird thing is that the XML
renderer doesn't seem to get the same values for the area tree elements
as the PDF renderer.

For now I'm unable to find what's wrong. Perhaps those who have a
better knowledge of the layout part or of the test system will be able
to give some hints? I'll keep searching, but I think you can start
looking at my modifications right now. I hope to find the problem soon.

Normally the pdf, ps and AWT outputs should work well. The default
font-config file may require some adaptation, especially for the AWT
output. You can find information in [1] or ask me.

Please don't hesitate to tell me if I've done things wrong or to ask any
questions.

Thanks,
Vincent

[1] http://www.axsl.org/font/configure.html


Re: FOrayFont patch almost ready

2005-12-13 Thread Vincent Hennebert

Well, that's just what I was wondering: Bugzilla doesn't seem to have
made a notification on the fop-dev list. O_o
You may find it directly on Bugzilla, bug number 35948. There's
something I don't get: its status is still NEW but it doesn't appear
when searching for the new bugs for Fop.

Something I forgot to mention: there is still a little issue with
transcoders: there is currently no means to configure the access to the
font-config file. I'll post a note on this later today, in a dedicated
thread.

Thanks,
Vincent


Jeremias Maerki wrote:

Uhm, cool but where did you post your pre-patch? :-)

On 12.12.2005 22:44:04 Vincent Hennebert wrote:







Jeremias Maerki





Re: DO NOT REPLY [Bug 37879] - PDF SVG rendering forces stroking text (config setting broken)

2005-12-13 Thread Vincent Hennebert

Don't know if this bug should be closed?

Vincent

[EMAIL PROTECTED] wrote:

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
http://issues.apache.org/bugzilla/show_bug.cgi?id=37879.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=37879





--- Additional Comments From [EMAIL PROTECTED]  2005-12-13 10:52 ---
This is broken in fop-0.90 with the provided Batik jar. It should work with the
svn versions of /both/ Fop and Batik. You may look at [1] and [2] for details. I
think this is the only solution for now if you need this functionality.

Regarding the strokeText option, it has not been implemented in the Trunk
because its usefulness is doubtful (see [2]). Text will be rendered as text
whenever possible, strokes will only be used as fallbacks.

HTH

[1] http://marc.theaimsgroup.com/?l=fop-dev&m=113293237123386&w=2
[2] http://marc.theaimsgroup.com/?l=fop-dev&m=113301057529277&w=2





Transcoders and font configuration

2005-12-13 Thread Vincent Hennebert

This is perhaps more just for the record.

FOrayFont needs a font-config file to run properly; the most basic file
will only contain the paths to the base14 metrics afm files. If no
config file is specified then no font will be configured, which will
obviously lead to rendering errors.

I don't think providing a font-config file is a problem, either
for the command-line or for the programmatic ways of using Fop.
However, you may not like it; if this is the case we can discuss
possible workarounds.

Anyway there is a case when this is kind of problematic: when the
transcoders are used through Batik to convert standalone svg files into
ps/pdf. It seems less acceptable here to ask the user to provide a
font-config file.

It should be possible to bundle the base14 afm files and a default
config file together in a resources jar. I've never played with such
things so I don't know yet how to handle them, but that should be
feasible and moreover Batik seems to already use such a mechanism.

Due to that issue transcoders are currently broken; a quick-to-implement
solution may be to set up a system property that would contain the path
to a font-config file.
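
A minimal sketch of that lookup order, in Java. The property name
"fop.font.config" and the bundled resource path are made up for
illustration; nothing like this exists in Fop or FOrayFont yet. An
explicitly configured file wins, and a config bundled in a resources jar
is the fallback:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

final class FontConfigLocator {
    // Hypothetical system property holding the path to a font-config file.
    static final String PROPERTY = "fop.font.config";
    // Hypothetical location of a default config inside a resources jar.
    static final String BUNDLED_RESOURCE = "/fonts/default-font-config.xml";

    /** Opens the configured file if the property is set, else the bundled default. */
    static InputStream open() throws IOException {
        String path = System.getProperty(PROPERTY);
        if (path != null && path.length() > 0) {
            return new FileInputStream(path);
        }
        // Class.getResourceAsStream returns null when the resource is absent.
        InputStream bundled =
                FontConfigLocator.class.getResourceAsStream(BUNDLED_RESOURCE);
        if (bundled == null) {
            throw new IOException("No font-config file: set -D" + PROPERTY
                    + "=<path> or bundle " + BUNDLED_RESOURCE);
        }
        return bundled;
    }
}
```

With something like this a transcoder user would only pass
-Dfop.font.config=/path/to/font-config.xml when the bundled default is
not enough.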

Anyway, before doing anything in this area, I would like to hear the
opinion of the team. WDYT? Do you know of other possible solutions?
Thoughts, ideas?

Thanks,
Vincent


Re: Review of the FOrayFont patch and FOrayFont itself

2005-12-22 Thread Vincent Hennebert

Damn :-(

Looks like some more work is needed. The problem is that it no longer
depends only on me.

Basically I agree with reasons 1. and 3. I don't really get the second
one, perhaps because I don't have a broad view of the problem. However
the distinction between system fonts and free-standing fonts looks clear
to me: the former are fonts handled by the Java AWT system, for which
some information may be lacking (e.g., embedding); the latter are those
for which a font file is available, and are handled externally.

Anyway, I think that there is a need for reviewing all the font stuff.
Some issues about font baselines, character selection, glyph
substitution and so on haven't been handled yet, or only partially.

I was hoping to see FOrayFont integrated as is in the trunk as a first
step, before starting to improve the font system and integrate other
functionality. It looks like that is impossible.

It may be useful anyway to create a branch for the patch, so that
other people can have a look at it. I made the patch against revision
356368. I'll let you decide.

I'll spend some time now studying all the needs of a font sub-system for
a FO processor: on the layout side, regarding the different font types,
and the various renderers. I'll collect everything that has already been
said on this list; I'll study the font formats in more detail. I think
I'll put all that on a Wiki page, but perhaps rather in the aXSL area; I
don't know yet.

This will require time, I have many things to learn; so don't expect any
concrete result before... long. Any comment or opinion is welcome.

Vincent


Jeremias Maerki wrote:

I've applied Vincent's patch locally and went through the code. I had to
do several modifications because I've recently changed a few things in
the Trunk which broke Vincent's patch. I managed to get it to work
without bigger problems. However, I must say that I currently could not
vote +1 to apply the patch. Let me show why:

The main reason right now is lack of time. This topic will eat quite
some time as it turns out. There are several points I would want to have
improved first:

1. As I suggested earlier, having a mandatory font configuration is not
acceptable IMO. The whole thing needs to work out-of-the-box for the set
of fonts that can be considered present for each target format (Base 14
fonts for PDF and PS, AWT fonts for Java2D output etc.). To make this
work, the AFM files for the Base 14 fonts would be placed in the JAR
file. However, while going through FOray's source code I found that the
font loading code currently needs a RandomAccessFile, which means that
either the file has to be copied to the file system first, or a new
implementation of the RandomInput interface has to be written for
access to an in-memory copy of the AFM files, or the code has to be
rewritten to work on a simple InputStream.
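
A rough sketch of the in-memory option, in Java. The class name and its
methods are hypothetical and do not reproduce FOray's actual RandomInput
interface; the point is only that a stream read from a JAR can be
buffered once and then accessed with seek semantics:

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class InMemoryRandomInput {
    private final byte[] data;
    private int position;

    /** Reads the whole stream (e.g. a resource inside a JAR) into memory. */
    InMemoryRandomInput(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        data = buffer.toByteArray();
    }

    long length() {
        return data.length;
    }

    /** Moves the read position, like RandomAccessFile.seek. */
    void seek(long offset) throws IOException {
        if (offset < 0 || offset > data.length) {
            throw new IOException("seek out of range: " + offset);
        }
        position = (int) offset;
    }

    /** Returns the next byte as 0..255, or -1 at end of data. */
    int read() {
        return position < data.length ? (data[position++] & 0xFF) : -1;
    }

    /** Reads a big-endian unsigned 16-bit value, as used in TrueType tables. */
    int readUnsignedShort() throws EOFException {
        if (position + 2 > data.length) {
            throw new EOFException();
        }
        int value = ((data[position] & 0xFF) << 8) | (data[position + 1] & 0xFF);
        position += 2;
        return value;
    }
}
```

The cost is one full copy of each font file in memory, which is small for
AFM files but may matter for large TrueType fonts.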

2. I'm a little disappointed that Victor did not follow my ideas of font
sources, but I guess it was easier for him this way. So I can understand.
With my approach, the FontServer would hold a set of font sources each
of which provides access to a set of fonts (i.e. the AWT fonts, the
Base14 fonts, a directory of Type1 fonts etc.). The client application
would tell the FontServer (via the FontConsumer I guess) which font
sources are acceptable. For each font source an URI could be defined
that identifies it so that interoperability and extensibility are
preserved. ATM, there are only the two somewhat artificial groups:
free-standing fonts and system fonts. I don't think this is flexible
enough in the long run. Anyway, the current aXSL Font API feels a little
strange still. But some of that I've already pointed out to Victor at an
earlier opportunity. The way the used fonts are stored is also a suspect
point to me since it bears the possibility of a memory-leak if a
FontConsumer is not properly unregistered in case of an exception.
Currently, the FontConsumer is not released at all. This would have to
be looked at closely. 
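
To make the idea of font sources more concrete, here is a rough sketch in
Java. All names (FontSource, SimpleFontSource, SketchFontServer) and the
urn:example URIs are hypothetical; they are not part of aXSL or FOray.
The server holds sources identified by URI, and a client names the
sources it accepts:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical: one group of fonts (Base 14, AWT, a Type1 directory...). */
interface FontSource {
    String uri();
    List<String> fontNames();
}

final class SimpleFontSource implements FontSource {
    private final String uri;
    private final List<String> names;

    SimpleFontSource(String uri, List<String> names) {
        this.uri = uri;
        this.names = new ArrayList<>(names);
    }

    public String uri() { return uri; }
    public List<String> fontNames() { return names; }
}

/** Hypothetical server: holds sources and filters them per client. */
final class SketchFontServer {
    private final Map<String, FontSource> sources = new LinkedHashMap<>();

    void register(FontSource source) {
        sources.put(source.uri(), source);
    }

    /** Returns the fonts of every registered source the client accepts. */
    List<String> availableFonts(Set<String> acceptedUris) {
        List<String> result = new ArrayList<>();
        for (FontSource source : sources.values()) {
            if (acceptedUris.contains(source.uri())) {
                result.addAll(source.fontNames());
            }
        }
        return result;
    }
}
```

A PDF renderer would then accept something like the Base 14 source plus
any configured directories, while a Java2D renderer would accept only the
AWT source.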


3. The configuration file is too complicated in my opinion (especially
the mapping part). It should be much easier. The complexity somewhat
kills the benefit of loading fonts directly, and not via XML font
metrics files. I would really, really like to have the possibility to
specify a directory and all fonts are all automatically made available.
People already have problems all the time to get their fonts working. I
would not like to see it getting even more complicated. After all, you
don't have to write a complex configuration file when you install a font
in Windows, for example. I agree that there can be sections in the
config file to specify font substitutions but the default font names
should be available automatically. While testing I had to specify
absolute paths to get the font config working quickly. In the short time
I was looking at it, I didn't manage to get it to work otherwise.

In the end, I have to look at the cost/benefit ratio:
+ ToUnicode fix (also 

Re: Combine FOP PDFBox efforts?

2006-03-15 Thread Vincent Hennebert

Hi Ben, hi All,

I finally have some time to chime in, sorry for the delay. Thank you for
your interest in the font subsystem.

My goal is to adapt the FOrayFont library to Fop. The main advantage of
FOrayFont over the Fop code is its ability to directly parse font files,
whereas currently with Fop there is a two-step process: first convert
the font metrics into an xml file, then use it within Fop through a
configuration file. The process is described in [1].

I submitted a first patch in December [2], which was refused because
of unacceptable shortcomings of FOrayFont. The main reasons were:
* lack of a default config file;
* configuration too complicated.
You will find all the details in [3]. Since then I have been working with
Victor on FOrayFont's improvement. We have recently ended the design phase
and have agreed on a set of changes that I still have to apply (you will
find the discussion in the FOray-dev mailing list archive from the last
two months; I'll add more on this on FOray-dev). After that I believe
that the main shortcomings will be corrected and that an updated patch
can be submitted.

PDFBox is pretty independent of my work. I currently rely entirely on
the Fop PDF library for PDF output, and I'm only adapting what is
necessary to make it use FOrayFont. FOrayFont is a low-level library that
tries to be independent of any output format, and thus may be used by
any renderer. So if PDFBox were to be used by Fop, for me it would
just mean that I would have to adapt PDFBox instead of the Fop library.

For FontBox this is different, and I think there is a possibility to
share resources in this area. I'll put more details on FOray-dev, but in
short it would be great if we could achieve the following:
* merge the best of FontBox and FOrayFont to obtain a good font library;
* agree on a common interface (i.e., an API) for the font library, that
  would be used jointly by Fop, PDFBox and FOray;
* adapt PDFBox to make it use this resulting library;
* make it work with Fop in some manner.

I would like to work with you on the first two points. As you have
probably already noticed the discussion will be mainly held in the FOray
area. We will chime in here for Fop-specific things and to notify Fop
devs of advancements of the adaptation work.

I'm glad to see that there is room for collaboration. I'm sure that we
will be able to achieve Great Things ;-)

Cheers,
Vincent


Current way to configure fonts in Fop:
[1] http://xmlgraphics.apache.org/fop/trunk/fonts.html

Patch for the adaptation of FOrayFont to Fop (now outdated):
[2] http://issues.apache.org/bugzilla/show_bug.cgi?id=35948

Reasons for the patch refusal:
[3] http://mail-archives.apache.org/mod_mbox/xmlgraphics-fop-dev/200512.mbox/browser




Ben Litchfield wrote:

Jeremias,

I'll start by answering your questions

1)What is minimum JDK required by PDFBox?

PDFBox currently requires 1.4, because it uses ImageIO and a couple 
other things that make development much easier.  PDFBox was compatible 
with 1.3 for a long time, but I made a decision that sticking with 1.3 
would cost too much in development time versus using existing stuff in 
1.4.  In addition 1.3 is now two major versions old and in the EOL 
phase.  As this effort will take some time before it could be released 
would it be reasonable to move the minimum requirement up to 1.4 for 
Batik and FOP at that time?


2)Does PDFBox require log4j?

PDFBox used to be dependent on log4j; 0.7.2 has an optional dependency,
and the soon-to-be-released 0.7.3 version will not use log4j at all.
Currently PDFBox's only dependency is FontBox (see comments below),
although Bouncy Castle will soon become an optional dependency for
certificate-based encryption, and Rhino (looks like Batik uses this as
well) will also be an optional dependency for JavaScript execution.



Some additional comments,
*After the 0.7.2 release, PDFBox split the font infrastructure into 
another project, so aptly named FontBox.  No official version has been 
released yet but the project was created and all font parsing logic was 
separated from PDFBox.  As far as I can tell there is no open source 
font library and for many of the same reasons we have discussed I 
thought it would be better as a separate project.  It sounds like there 
has already been some discussion on making a separate font library 
project, I would be happy to collaborate on and donate what little font 
parsing code I have to that project.  It only makes sense for 
PDFBox/FOP/Batik/... to all use a single font library.  It is starting 
to sound like a unified font system might be the first task.


*I did not realize that other projects (Batik) were using FOP's PDF
library; again, a separate PDFFont library makes that cleaner.  As a
side note, PDFs can contain SVG graphics, so I eventually saw PDFBox 
utilizing Batik, which makes things interesting :)


*If bringing PDFBox into ASF is what is necessary to make this work then
I am willing to do that.

Re: Combine FOP PDFBox efforts?

2006-03-16 Thread Vincent Hennebert
Great, I will start updating PDFBox to use the FOrayFont, I believe 
this will go pretty smoothly because FOrayFont is already being used 
for PDF creation.  More details on the FOray list.


We have had some recent discussions about supported JREs; the
main page of FOray [1] says that 1.4 is used.  There is a desire
among the FOP developers to maintain compatibility with 1.3.  Do you
know if FOrayFont is compatible with 1.3?


Actually I haven't taken care of this issue yet. I'm hoping that it
won't be too difficult to make it 1.3-compliant, as we only use basic
classes of the standard library. My goal is to first have it accepted in
Fop, and then do what is necessary to achieve 1.3 compliance (actually,
if someone else would volunteer to take care of that last step, even
better ;-) )

Vincent


Re: Google Summer of Code

2006-04-19 Thread Vincent Hennebert

Jeremias Maerki wrote:

I don't see why you couldn't also apply for a slot. Having floats would
be cool (but not simple). I'm not sure about your work on FOrayFont. I
think the projects have to be tied to one of the organizations listed by
Google. Only the client part in FOP would be part of that. So floats
would be the better option I guess.


By re-reading the FAQ I noticed that the work we do for a project to
which we are already contributing must be new. So it's best that I apply
for floats.

I can't really estimate the amount of work needed to implement floats,
besides the documentation phase. My feeling is that it shouldn't be too
difficult to implement before-floats, as there already is a number of
papers dealing with this quite well-known issue. Side-floats should be
much trickier, as they would interfere with line-breaking, and the CSS
model for them looks rather complicated.
We could propose before-floats as a main goal, plus basic side-float
implementation and optional refinements.



BTW, both of you should make sure you're eligible for participating in
the SoC. See the Students FAQ on the Google SoC site. Someone who
applied last year wasn't a student.


As a PhD student I'm eligible, no problem.


Vincent


Re: Google Summer of Code

2006-04-23 Thread Vincent Hennebert

Hi,

I'm thinking about setting up a page on the FOP-wiki where I would put 
up the goals for my proposal. That way I could give the link when 
submitting my application.


Nice idea, indeed. I've taken the liberty of implementing it and created
a new Wiki page for the SoC, linked from the DeveloperPages [1]. The
idea is to have a main page listing the various proposals, with a link
to each proposal's dedicated page. You'll just have to add yours.

I've put a first draft regarding float implementation. Any comment is
welcome, especially regarding scheduling (I've no idea of the time each
task may need - never done project estimations!), and typos. Thanks.

Vincent

[1] http://wiki.apache.org/xmlgraphics-fop/GoogleSummerOfCode2006


Re: Google Summer of Code

2006-04-24 Thread Vincent Hennebert

Hi Jeremias,

Thanks for your review!


Vincent, a comment on yours: For before-floats, you refer to best-fit
and first-fit approaches. I'm not sure if it's really relevant here. If
I'm not mistaken before-floats are pretty similar to footnotes which
means you can probably take a lot from there. I think that first-fit
only really helps when you get to side-floats.


Well, my understanding of before-floats makes me think that they may
benefit from a total-fit algorithm. WRT footnotes, the difference is
that a footnote must appear on the same page as its reference; this is a
constraint that doesn't exist for before-floats, although it is best to
place them as near their reference as possible, of course.

When referring to first-fit/total-fit algorithms I had in mind the paper
"Pagination Reconsidered" referenced on the Wiki [1]. To summarize
quickly, it states that the placement of floats benefits from a
better algorithm than first-fit, much like the computation of page
breaks. In fact LaTeX uses a first-fit algorithm, which often leads to
floats being placed pages away from their reference. So I had the feeling
that a total-fit algorithm could be applied to floats as well as to page
breaks.
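
To illustrate the difference, here is a toy comparison in Java. It is
only a sketch under strong assumptions: items are plain heights, every
item fits on a page, and the "demerit" of a page is simply the square of
its unused height (the last page included). It is not Fop's breaking
code and not the algorithm from the paper. First-fit fills each page
greedily; total-fit uses dynamic programming over all feasible breaks:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class FitComparison {

    /** Cost of one page: squared unused height (a simplified demerit). */
    static long pageCost(int used, int capacity) {
        long slack = capacity - used;
        return slack * slack;
    }

    /** First-fit: fill each page as far as possible and never look back. */
    static List<Integer> firstFit(int[] heights, int capacity) {
        List<Integer> breaks = new ArrayList<>();
        int used = 0;
        for (int i = 0; i < heights.length; i++) {
            if (used > 0 && used + heights[i] > capacity) {
                breaks.add(i);          // the page ends just before item i
                used = 0;
            }
            used += heights[i];
        }
        breaks.add(heights.length);     // the last page ends after the last item
        return breaks;
    }

    /** Total-fit: choose the set of breaks with the lowest total cost. */
    static List<Integer> totalFit(int[] heights, int capacity) {
        int n = heights.length;
        long[] best = new long[n + 1];  // best[k]: minimal cost to place items 0..k-1
        int[] previous = new int[n + 1];
        Arrays.fill(best, Long.MAX_VALUE);
        best[0] = 0;
        for (int end = 1; end <= n; end++) {
            int used = 0;
            for (int start = end - 1; start >= 0; start--) {
                used += heights[start];
                if (used > capacity) break;                 // page would overflow
                if (best[start] == Long.MAX_VALUE) continue; // prefix not placeable
                long cost = best[start] + pageCost(used, capacity);
                if (cost < best[end]) {
                    best[end] = cost;
                    previous[end] = start;
                }
            }
        }
        if (best[n] == Long.MAX_VALUE) {
            throw new IllegalArgumentException("an item is taller than a page");
        }
        List<Integer> breaks = new ArrayList<>();
        for (int end = n; end > 0; end = previous[end]) {
            breaks.add(end);
        }
        Collections.reverse(breaks);
        return breaks;
    }

    /** Sums the page costs for a given list of break positions. */
    static long totalCost(int[] heights, int capacity, List<Integer> breaks) {
        long cost = 0;
        int start = 0;
        for (int end : breaks) {
            int used = 0;
            for (int i = start; i < end; i++) used += heights[i];
            cost += pageCost(used, capacity);
            start = end;
        }
        return cost;
    }
}
```

With heights {4, 4, 3, 3, 3, 3} and a capacity of 10, first-fit breaks
after items 2, 5 and 6 (cost 4 + 1 + 49 = 54), leaving a nearly empty
last page, while total-fit breaks after items 2, 4 and 6 (cost
4 + 16 + 16 = 36).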



I think that user-configuration for best/first-fit doesn't help much. I
rather think that FOP will have to find out itself whether it has to
abandon best-fit so it can handle more complex features, i.e. switching
to first-fit if it finds one of the following conditions:
- side-float in the flow
- page-masters with different available IPD in the flow
- tables with auto table layout (when individual column widths are
requested for each page)


That may be discussed. I remember discussions on this list about the
cost of a total-fit algorithm in terms of time and memory consumption.
IIRC it was stated that such an algorithm should be optional for
rendering documents that don't need it, like invoices. That's why I was
thinking of a config option.

Regarding Fop automatically switching to the most appropriate algorithm
depending on the situation, I'm afraid of the complexity of such a
behavior. It may be difficult to estimate which strategy is to be
adopted, and when it should be decided that a given strategy has failed
and that another one should be tried. But I may be wrong.

Vincent


[1] http://wiki.apache.org/xmlgraphics-fop/LiteratureLinks


Re: Google Summer of Code

2006-05-07 Thread Vincent Hennebert

Hi Jeremias,

Normally my application should have been sent. However, I don't see it on
my home page, so I wonder if it has to be reviewed first or if the
submission failed. The way the webapp works isn't clear to me yet. Do you
see my application?

Vincent


Jeremias Maerki wrote:

Hi Patrick,

green lights everywhere. I've already ranked it and applied for the
mentor position. It turns out that the Wiki in the ASF this time only
was about collecting ideas for projects (unlike last year). The real
project proposals are those on the Google site. At this moment, only 7
entries (including yours) have requests for the mentor position and
therefore have a score of =4.

Now we only need Vincent who should be back on Sunday. If he enters his
proposal then, everything should be fine. Vincent, ping me when you
entered it so I can go right to the ranking part.

On 05.05.2006 18:27:50 Patrick Paul wrote:


Hi Jeremias,

Any news from GSoC ? Were you able to see my application and rank it ? 
How are things going within the ASF ?


Patrick

Jeremias Maerki wrote:



It looks like you two need to sign up for the GSoC directly on the
website and enter your application there based on the info on the ASF
Wiki. This needs to be completed by May 8. After that the
applications will be rated by the mentors. To me this was all a bit
confusing despite the extensive FAQs. Realized only today that I have to
sign up, too, as a mentor.

http://code.google.com/

On 28.04.2006 17:24:43 Patrick Paul wrote:




Same here, I've added an entry for auto table layout, right after Vincent's.

Patrick

Vincent Hennebert wrote:

  




Ok, I've added an entry for floats implementation.

I'll be off-line from tomorrow until the 6th of May. Just hoping no
particular problem will occur during my absence. Anyway, if I don't
answer mails that's just normal.


Vincent



Jeremias Maerki wrote:






Yes, please add entries for the two projects. That Wiki page is the
first station. I'm not sure, yet, who exactly will transfer the
proposals to Google (can't remember from last year, either), but the
first step is to identify the projects inside the ASF. I'll need to read
through the whole mentoring info and stuff during the weekend. You can
list me as a mentor on both projects. If I can get help from another FOP
committer mentoring, all the better.

On 27.04.2006 16:22:44 Patrick Paul wrote:

  




Jeremias,

Do you think we should have our two projects posted on the Apache 
Wiki ?


http://wiki.apache.org/general/SummerOfCode2006

Is it better to go through that page, or will our proposals be 
forwarded directly to you anyway ?


Thanks,

Patrick










Jeremias Maerki









Jeremias Maerki



Re: Google Summer of Code

2006-05-07 Thread Vincent Hennebert

Ok, this should have worked this time. Don't know what happened.

Vincent


Jeremias Maerki wrote:

Vincent, I'm afraid I don't see your application, either. You could
simply try again or contact the GSoC support.

On 07.05.2006 19:12:26 Vincent Hennebert wrote:






Jeremias Maerki



Re: svn commit: r406917 - in /xmlgraphics/fop/trunk: src/java/org/apache/fop/fonts/truetype/TTFFile.java status.xml

2006-05-25 Thread Vincent Hennebert

<snip/>

Also do we know what Foray Fonts does?



Prefers the OS/2.sTypoAscender over hhea.Ascender just like FOP did
before my patch.


FWIW: I've just notified Victor of the issue. Just waiting for his
comments.

BTW, my work on fonts (which already wasn't going very fast) may further
suffer from my participation in the SoC. If anyone wants to take the
lead...

Vincent


Re: Google Summer of Code

2006-05-29 Thread Vincent Hennebert

Hi Simon,

Simon Pepping wrote:

I have winters of code and summers of less code. That diminishes my
ability to provide assistance.

What kind of work and responsibilities would assistance include? What
would be the time constraints, that is, how fast does a review need to be
done or a question be answered? And in which period is the work done?


The work begins right now and ends on August 21. There is a mid-program
evaluation on June 26.

As far as I'm concerned, I'm planning to put my thoughts about the
project on a dedicated Wiki page. I think that just providing high-level
comments on them would already be of great help: am I going in the
right direction? Am I running into a dead-end? Have I missed something
important? In particular, I'll be playing with Knuth's glue/box/penalty
model, and as you seem to have good knowledge of it I would be glad to
hear from you about that.

Anyway, whatever you'll be able to do will be fine. Thank you for your
interest!

Vincent


Re: Google Summer of Code

2006-05-30 Thread Vincent Hennebert

Hi Simon,

Simon Pepping wrote:

I have one small comment on your decomposition of the line breaking
algorithm:



  * defining a somewhat arbitrary formula used to compute the demerit of each 
break, and which is to be minimized;



I find the above second item in this list a bit misplaced. It is part
of the definition of the algorithm rather than its actions.


You're right. What I wanted to do is extract the three most important
aspects which IMO characterize the algorithm. I've rephrased the text to
make it clearer.



Regarding the last list, I am not sure what you mean by 'a floating
sequence of g/b/p items'. A subsequence whose position in the large
sequence is floating?


Exactly. This subsequence represents before-floats and footnotes.
Footnotes are a bit more constrained as they should appear (at least
partly) on the same page as the footnote-citation.


Thank you,
Vincent



Re: [GSoC] Wiki page for progress informations

2006-05-30 Thread Vincent Hennebert

Hi Jeremias,

Jeremias Maerki wrote:

did you already investigate how footnotes are implemented? Can you say
anything about how similar the problem of footnotes is to before-floats?
Just so you don't have to start from scratch while there may be
something to build upon. After all, the footnotes also contain some
logic to move certain parts to a different page than where the anchor is
located.


I'm certainly planning to look at how footnotes are implemented. There
will probably be things to share in this area. My feeling right now is
that floats may be easier to deal with as they are not required to
appear on the same page as their citation.



Another thing that we may need to keep in mind: There was lots of desire
from the user community that FOP supports large documents (long-term
goal, not necessarily yours). I wrote that a first-fit algorithm could
help free memory earlier. Obviously, for complex before-float situations
a total-fit approach is probably more interesting as it can come up with
more creative solutions. I'm just mentioning it so we keep the bigger
picture in mind and since there could be conflicting goals.


Actually it is stated in the project's goals that two algorithms be
implemented: a quick, memory-friendly one (first/best fit) and a
high-quality, slow, memory-consuming one (total fit). It seems that
best-fit will be similar to first-fit in terms of process- and
memory-consumption, yet better in quality. But this has still to be
investigated. Also, some work might be shared with Patrick as the
page-breaking algorithm will affect automatic layout of tables, as far
as I understand.

Vincent



Re: [GSoC] Wiki page for progress informations

2006-05-31 Thread Vincent Hennebert

Thanks a lot Luca! This will help me find my way in the code. I'll keep
your comments in mind for when I better understand the whole issue.

Vincent


Luca Furini wrote:

Jeremias Maerki wrote:


did you already investigate how footnotes are implemented? Can you say
anything about how similar the problem of footnotes is to before-floats?
Just so you don't have to start from scratch while there may be
something to build upon. After all, the footnotes also contain some
logic to move certain parts to a different page than where the anchor is
located.



A few quick comments about the footnote implementation:

1) the FootnoteLM returns only the sequence of elements representing the 
inline part (not the footnote-body part); it just adds to the last 
(inline) box a reference to the FootnoteBodyLM.


2) the LineLM, after computing the breaks, adds to each (block) box 
representing a line the references to the FootnoteBodyLM whose citations 
are in that line


3) during the remainder of the element collection phase, these
references are not used (except in the creation of combined element
lists, when they should be copied inside the new elements)


4) the PageSequenceLM.PageBreaker.getNextKnuthElements() method, after 
receiving all the (block) elements, scans them looking for footnote 
information, gets the elements from the referenced FootnoteBodyLM and 
puts them in a different list (at the moment a list of lists, but this 
is sub-optimal), and from the footnote-separator (in a separate list)


5) these lists are looked at in 
PageBreakingAlgorithm.computeDifference(), where we try to add some 
footnote content to the normal page content using getFootnoteSplit(), 
and in computeDemerits(), where some extra demerits are added if we 
break a footnote or some footnotes are deferred.


This last point is at the moment performed using many
PageBreakingAlgorithm private variables, which is maybe not the best way
to do it, as we must be very careful about their initialization and
their use, especially when the algorithm restarts. I think that a
state object could be used to store these values and be explicitly
passed along the methods instead of relying on the class members, but
concerning this I'd like to hear the opinions of the other
committers ...


Insertion of before-floats could be implemented in a similar way, giving 
the precedence to the footnote insertion (as it is affected by more 
strict constraints).


An important difference between a footnote and a before-float is that 
the latter does not have an inline part, so (if we want to follow the 
same pattern) we need to either store the reference inside a 
previously-created box or to add some new elements containing the 
reference (but we must be sure that these elements cannot be parted from 
the previous ones, see the constraints in section 6.10.2 in the spec).


A crucial point is the demerit function as, if I remember correctly, it
greatly affects the computational complexity of the breaking algorithm
(there should be a M. Plass paper concerning this).


HTH


Another thing that we may need to keep in mind: There was lots of desire
from the user community that FOP supports large documents (long-term
goal, not necessarily yours). I wrote that a first-fit algorithm could
help free memory earlier. Obviously, for complex before-float situations
a total-fit approach is probably more interesting as it can come up with
more creative solutions. I'm just mentioning it so we keep the bigger
picture in mind and since there could be conflicting goals.



A first degree of first-fit algorithm could be achieved quite quickly 
by having a BreakingAlgorithm interface which is implemented by a 
TotalFitBA (the existing implementation) and a FirstFitBA which would 
have a much simpler considerLegalBreak() method that, instead of the 
complex set of nodes, just keeps in mind a single node.


This would surely decrease the memory footprint, but is not (I think) 
what we really want, as this simplified algorithm would be performed on 
the whole sequence of elements.


In order to start processing the sequence as soon as we receive a few 
elements we need to do some deeper changes.


An idea (I just had it now, so I did not fully consider all its 
implications).
At the moment, the block-level LMs collect elements from their children 
and return just a single sequence (if there are no break conditions); we 
could have a parameter requesting them to return after they receive each 
child sub-sequence, and have a canStartComputingBreak() method that 
returns true if the sequence contains enough elements and we are using a 
first-fit algorithm, or false otherwise ...


Sorry for the long post ... and for the long absence too, but it seems 
that just after thinking "great, now I've really got some time to spend 
on FOP" I receive tons of other things to do ... :-(


Regards
Luca




Plass' thesis: Optimal Pagination Techniques...

2006-06-01 Thread Vincent Hennebert

Hi all,

A bit more than one year ago Plass' thesis was cited on this list [1].
By looking at the 24 Page Preview available on the site it seems that
this work may help me to have a better understanding of the issue, and
solve some of the problems I raised on the Wiki.

I would like confirmation from those (namely Jeremias) who
have bought this book that it is worth its price ($35 for an image-only
PDF file... a bit expensive, IMO). So?

Thanks,
Vincent

[1] 
http://mail-archives.apache.org/mod_mbox/xmlgraphics-fop-dev/200503.mbox/[EMAIL PROTECTED]


Re: [Xmlgraphics-fop Wiki] Update of GoogleSummerOfCode2006/FloatsImplementationProgress by VincentHennebert

2006-06-11 Thread Vincent Hennebert

I've looked at all the pages in the section "Documents with Relation to
the Knuth Approach" of DeveloperPages, which provide a good starting
point. I don't think there are other pages elsewhere?

Vincent


Simon Pepping a écrit :

Vincent,

Are you aware that Luca has documented his implementation on the Wiki?

Simon

On Fri, Jun 09, 2006 at 09:56:44AM +0200, Jeremias Maerki wrote:


Vincent,

don't hesitate to send patches if you want to clean up and better
document the code. I've recently wondered if it made sense to split
PageBreakingAlgorithm into two classes, for example, a
VerticalBreakingAlgorithm (base class and used by static-content and
block-container) and a PageBreakingAlgorithm (which adds footnote and
before-float functionality). That may make the code clearer.

On 07.06.2006 19:34:56 Apache Wiki wrote:
<snip/>


+ == The Fop Source Code ==
+ Even if it is well explained, the Knuth line-breaking algorithm isn't
that easy to understand. ATM I've concentrated on the class
layoutmgr.BreakingAlgorithm, which contains the part of the algorithm
which is common to page- and line-breaking. It is split in parts
which follow pretty closely those described in Digital Typography. It
relies on the following skeleton:{{{


<snip/>

Jeremias Maerki






Re: DO NOT REPLY [Bug 39777] - [PATCH] GSoC: floats implementation

2006-06-13 Thread Vincent Hennebert

Hi,


--- Additional Comments From [EMAIL PROTECTED]  2006-06-12 12:45 ---
(In reply to comment #0)


This patch isn't really meant to be applied... Rather to be reviewed by
interested parties to check if I'm not wrong. Changelog:
* javadocs for the Knuth line- and page-breaking algorithms. Some items are
marked with double question marks because I haven't found out yet what is their
purpose. I will probably find eventually, but if anybody has immediate hints
they will be welcome.



KnuthBlockBox:
bpdim seems to be used in concert with the proprietary display-align=fill
value Luca implemented. See AbstractBreaker.optimizeLineLength(). If I
understand it right it is somehow used to make sure all the pages have more or
less the same amount of content (in bpd).


OK. Actually this is the natural width (without stretching or
shrinking) of the line represented by this box. This field apparently
exists because it isn't possible to get the min/opt/max values stored in
a MinOptMax object. Otherwise it could be retrieved from the opt of the
ipdRange field. It may perhaps be useful to add such methods to
MinOptMax?



pos and bAux are defined in ListElement/KnuthElement.


Hmmm. Does a Position object represent the index of the Knuth element
(here KnuthBlockBox) in the sequence managed by the corresponding LM?
What does it mean that a box is auxiliary?



BreakingAlgorithm:
alignment: EN_BEFORE is not used. EN_START is used instead, since the class is
used in both ipd and bpd. EN_BEFORE is mapped into EN_START. Actually, alignment
uses a slightly different set of values than the FO properties, so reusing the
integer constants may not be the best thing, but we're not under Java 5, yet,
where we could use enums.
bFirst is used for the text-indent property so only the first paragraph of a
block is indented. See block_text-indent.xml.


You probably mean the first /line/ of a block?



partOverflowRecovery is used in page breaking to defer an element which would
overflow the available BPD to the next page if it's the only element in a part
(=line or page).


I'm still a bit unsure of what 'part' means in the javadoc of
BreakingAlgorithm.isPartOverflowRecoveryActivated. A line/page? A
word/block? A Knuth box?



* some methods have been marked deprecated because AFAICT they are not called
anywhere. If this is agreed I'll remove them in my next patch



+1



* bugfix? In the last for loop of the method
layoutmgr.PageBreakingAlgorithm.noBreakBetween I think the exit condition should
be a strict comparison ('<' instead of '<='). Confirmation?



not sure. :-(


The code in the for loop checks if the element pointed to by index is a
legal breakpoint. If the exit comparison isn't strict index may reach
the value breakIndex, which by definition is a legal break. I guess the
purpose of the noBreakBetween method is to check that there is no legal
break between the two given breakpoints, /excluded/. The line
storedValue = (index == breakIndex)
would confirm that.
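
To make the reading above concrete, here is a minimal sketch of the loop with a strict exit comparison. It is only an illustration: the real method works on Knuth elements, whereas this version assumes a plain boolean array saying which positions are legal breakpoints, and the class and parameter names are made up.

```java
class NoBreakBetweenSketch {
    /**
     * Returns true if no position strictly between prevBreakIndex and breakIndex
     * is a legal breakpoint. legalBreak[i] is a stand-in for the real element test.
     */
    static boolean noBreakBetween(boolean[] legalBreak, int prevBreakIndex, int breakIndex) {
        int index;
        // Strict comparison: breakIndex itself is never examined.
        for (index = prevBreakIndex + 1; index < breakIndex; index++) {
            if (legalBreak[index]) {
                break; // a legal break was found in between
            }
        }
        return index == breakIndex;
    }

    public static void main(String[] args) {
        boolean[] legal = {true, false, false, true, false, true};
        System.out.println(noBreakBetween(legal, 0, 3)); // true
        System.out.println(noBreakBetween(legal, 0, 5)); // false
    }
}
```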






* the javadoc comments for some methods have been removed because they will
inherit them from their super-class



I think Checkstyle will bark about that. If you do Ctrl-J in Eclipse, you get an
automatic @see entry which satisfies Checkstyle. @inheritDoc does not work in
every Java version.


In fact checkstyle doesn't complain. It seems to be smart enough to
detect that there is a javadoc for the original version of a redefined
method. In such cases javadoc copies the definition from the
super-class, and that's also what Eclipse does in the tooltip. I may put
@see statements, but I think it doesn't really make sense.






* some checkstyle fixes



HTH


Updated patch follows.
Thanks,
Vincent


Re: keep...=always and Knuth penalties

2006-06-19 Thread Vincent Hennebert

FYI: I'm planning to refactor the breaking algorithm in order to
implement floats. I'll see what can be done in this area. Just keep in
touch.

Vincent


Manuel Mall a écrit :

On Monday 19 June 2006 16:45, Jeremias Maerki wrote:


On 18.06.2006 20:57:51 Simon Pepping wrote:


On Sun, Jun 18, 2006 at 07:36:45PM +0800, Manuel Mall wrote:


<snip/>


Or should we use a more refined approach where we generate
initially an INFINITE penalty but if the page breaking cannot
find a solution we reduce the penalty on some/all of those
elements given an INFINITE penalty because of keeps and run the
page breaker again?


I am in favor of this solution. There are generally two solutions:
increase the tolerance, or force a solution. I think FOP already
has a force parameter for this purpose.


+1. Yes, BreakingAlgorithm has a force parameter which is currently
set to true for page breaking. There's also a threshold. We can
probably play with that first. See
LineLayoutManager.findOptimalBreakPoints().




Yes, there is a force parameter and it seems to be always set to true 
for page breaking (and false for line breaking). But it doesn't seem to 
guarantee that breaks will be found, otherwise we shouldn't get 
the "giving up after 50 retries" message.


Does anyone understand how this force parameter is supposed to work?
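
For what it's worth, the "more refined approach" quoted above (start with INFINITE penalties for keeps, and only relax them if no solution is found) could look roughly like the sketch below. This is not how FOP's force parameter works; the greedy page filler, the INFINITE value and all names are assumptions chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

class KeepRelaxationSketch {
    /** Local stand-in for an "infinite" penalty; the value is an assumption of this sketch. */
    static final int INFINITE = 1000;

    /**
     * Greedy pagination: a break is legal after element i only if penaltyAfter[i] < INFINITE.
     * Returns the index just past the last element of each page, or null if a page
     * overflows before any legal break has been seen.
     */
    static List<Integer> paginate(int[] heights, int[] penaltyAfter, int pageHeight) {
        List<Integer> breaks = new ArrayList<>();
        int used = 0;
        int lastLegal = -1; // last element on the current page after which we may break
        int i = 0;
        while (i < heights.length) {
            if (used + heights[i] > pageHeight) {
                if (lastLegal < 0) {
                    return null; // no way to end this page
                }
                breaks.add(lastLegal + 1);
                i = lastLegal + 1; // restart the next page right after the break
                used = 0;
                lastLegal = -1;
                continue;
            }
            used += heights[i];
            if (penaltyAfter[i] < INFINITE) {
                lastLegal = i;
            }
            i++;
        }
        breaks.add(heights.length);
        return breaks;
    }

    /** Second pass: downgrade keep-induced INFINITE penalties to "very high but finite". */
    static List<Integer> paginateWithRelaxation(int[] heights, int[] penaltyAfter, int pageHeight) {
        List<Integer> result = paginate(heights, penaltyAfter, pageHeight);
        if (result != null) {
            return result;
        }
        int[] relaxed = penaltyAfter.clone();
        for (int i = 0; i < relaxed.length; i++) {
            if (relaxed[i] >= INFINITE) {
                relaxed[i] = INFINITE - 1;
            }
        }
        return paginate(heights, relaxed, pageHeight); // may still be null
    }
}
```

A real implementation would of course let the breaking algorithm weigh the relaxed penalties instead of treating every finite penalty alike; the sketch only shows the two-pass control flow.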



Jeremias Maerki



Manuel




[GSoC] How the work should progress

2006-06-21 Thread Vincent Hennebert

Hi all,

I'd like to have the opinion from the team about how I should proceed.
I'm currently at a point where I think I know enough, both from
theoretical and code points of view, to start the implementation of
floats. By mimicking the handling of footnotes, I think I can have a
working implementation rather quickly and easily. However, it wouldn't
be very satisfying IMO. Some refactoring wouldn't be useless, and while
I'm at it, why not do it completely?

I've already spent much time figuring out how the code is working. From
what I've seen, some areas of the code still look experimental. I think
the implementation of floats may be an opportunity to bring it to a more
polished level. A refactoring would have several benefits:
- this may help sorting things out, and even prepare the implementation
  of a first-fit algorithm (although this might be a bit too much
  unrelated, I'm afraid)
- this may help future contributors to understand this area of
  the code more easily and get involved more quickly
- it is always better to have a clean design. Moreover, I think it
  is possible to make the implementation even more object-oriented,
  which would help sharing code between the line and page levels.
- a refactoring process is more efficient and secure if one has the
  opportunity to think full-time about it...

That's why I would propose to refactor the breaking algorithm. However,
to do things properly I would need to understand a bit more of the code
than just the breaking stuff. This may take some time, especially if I
want to make sure that I don't introduce new errors. The implementation
of side-floats may suffer from that. That was not the original intent of
the SoC project, but I think this would be a benefit for Fop.

WDYT?
Vincent


Re: [GSoC] How the work should progress

2006-06-23 Thread Vincent Hennebert

Chris, Simon,

Thanks for your feedback. It seems that I missed Jeremias before he
left for Ireland. Too bad that my mail took half an hour to arrive.

Anyway, I have work to do on my thesis during a few days. Hopefully
Jeremias will be able to look at his mails during the conference. If
not, I'll follow Simon's advice and refactor just what is necessary to
implement floats.

Cheers,
Vincent


Simon Pepping a écrit :

On Wed, Jun 21, 2006 at 04:32:29PM +0200, Vincent Hennebert wrote:


Hi all,

I'd like to have the opinion from the team about how I should proceed.
I'm currently at a point where I think I know enough, both from
theoretical and code points of view, to start the implementation of
floats. By mimicking the handling of footnotes, I think I can have a
working implementation rather quickly and easily. However, it wouldn't
be very satisfying IMO. Some refactoring wouldn't be useless, and while
I'm at it, why not do it completely?



Be careful. Don't let yourself be sidetracked by other worthy
objectives. Stay focused on your target; it will be difficult enough.
 


I've already spent much time figuring out how the code is working. From
what I've seen, some areas of the code still look experimental. I think
the implementation of floats may be an opportunity to bring it to a more
polished level. A refactoring would have several benefits:
- this may help sorting things out, and even prepare the implementation
 of a first-fit algorithm (although this might be a bit too much
 unrelated, I'm afraid)



I would hope that a best-fit algorithm can be inserted.

Regards, Simon






Re: [GSoC] How the work should progress

2006-06-26 Thread Vincent Hennebert

Hi Jeremias,

<snip/>

Please do try to refactor the footnote and before-float stuff out into a
separate class to make the whole design clearer. But don't shift your
focus too much. Some factoring: +1, total refactoring -0.5, keep focus
on your task: +1. ;-)


Ok. So definitely I'll just refactor what is necessary to cleanly
implement before-floats. That said, some further refactoring might be
needed for side-floats... But we'll see at that moment.

Thanks,
Vincent


PercentBaseContext uselessly inherited

2006-06-29 Thread Vincent Hennebert

Hi all,

I've noticed that many *LayoutManager classes explicitly implement the
datatype.PercentBaseContext interface while it is already extended by
the LayoutManager interface. Same for the BlockLevel- and
InlineLevelLayoutManager interfaces.

All those classes or interfaces, which implement or extend
LayoutManager, implicitly also implement/extend the PercentBaseContext
interface. Thus there is no need that they themselves implement/extend
PercentBaseContext.

If this is agreed I'll remove the unnecessary extends/implements
statements in my next patch.

Or have I missed something?

Vincent


Re: PercentBaseContext uselessly inherited

2006-06-29 Thread Vincent Hennebert

Hi all,

I've noticed that many *LayoutManager classes explicitly implement the
datatype.PercentBaseContext interface while it is already extended by
the LayoutManager interface. Same for the BlockLevel- and
InlineLevelLayoutManager interfaces.

All those classes or interfaces, which implement or extend
LayoutManager, implicitly also implement/extend the PercentBaseContext
interface. Thus there is no need that they themselves implement/extend
PercentBaseContext.

If this is agreed I'll remove the unnecessary extends/implements
statements in my next patch.

Or have I missed something?


Ok, I /have/ missed something. I should perhaps have taken a nap before
writing that. I let myself get confused by the way the javadoc
displays information. For a class it lists all of the implemented
interfaces, even those which are only indirectly inherited.
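
The distinction can be checked with reflection. The following is just a small sketch with empty stand-in types named after the ones discussed (they are not the FOP classes): Class.getInterfaces() returns only the interfaces a type names directly, while isAssignableFrom() also sees the inherited ones, which is what the javadoc listing reflects.

```java
import java.util.Arrays;

// Empty stand-ins named after the FOP types discussed above; only the hierarchy matters.
interface PercentBaseContextDemo { }
interface LayoutManagerDemo extends PercentBaseContextDemo { }
class BlockLayoutManagerDemo implements LayoutManagerDemo { }

class InheritedInterfacesSketch {
    /** True only if the type itself names the interface in its implements/extends clause. */
    static boolean declaresDirectly(Class<?> type, Class<?> iface) {
        return Arrays.asList(type.getInterfaces()).contains(iface);
    }

    public static void main(String[] args) {
        Class<?> lm = BlockLayoutManagerDemo.class;
        System.out.println(PercentBaseContextDemo.class.isAssignableFrom(lm)); // true
        System.out.println(declaresDirectly(lm, LayoutManagerDemo.class));      // true
        System.out.println(declaresDirectly(lm, PercentBaseContextDemo.class)); // false
    }
}
```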

Sorry for the noise :-/

Vincent


Re: Error message: Should be first

2006-07-10 Thread Vincent Hennebert

One of my clients reported to me that he gets a "Should be first" error
message in the log. This happens in (Page)BreakingAlgorithm.removeNode().
I get the impression that the code there is not finished rather than
that is a real error condition. I'll try to extend removeNode() so it
really removes the disabled node.

See the attached demo file (you'll need Italian hyphenation available to
get the error).

I'll try to fix that tomorrow. If Luca or anyone else has any further comments
on that, I'd appreciate it.


I don't have any error with this file. However, I've had to change the
font names because i_helvetica is unknown. I've tried with Helvetica and
Helvetica + font-style=italic (which I suppose is what the i_ means)
but I still don't get any error. How did you get it?

Regarding the "Should be first" error, that's a part of the algorithm
I don't completely understand yet. That said, the removeNode method is
called, among other places, in filterActiveNodes. This is only a guess,
but if there is a place where removeNode's precondition isn't
respected, that might be it.

HTH,
Vincent


Necessary conditions to defer footnotes

2006-07-11 Thread Vincent Hennebert

Hi All,

there is something I don't get with the handling of footnotes. When
there is not enough room on the current page to place all the footnotes,
the algorithm tries to find a place where to split them. But there is a
condition: it must be possible to defer old footnotes
(PageBreakingAlgorithm, l.332). And this is possible only if there is no
legal breakpoint between the previous active node and the currently
considered breakpoint (checkCanDeferOldFootnotes method). I don't
understand this latter condition?

And, reading the code, I don't understand if this method's purpose is to
determine if it is /allowable/ to defer footnotes (am I authorized to
defer footnotes, if there are any), or if it is /possible/ (are there
footnotes to defer). Ok, this is a bit subtle, but understanding that
would help me get the intent of the algorithm.

What is the relation with before-floats? Well, I'm currently refactoring
this part of the code to factor out as much as possible the things common
to floats and footnotes. And this part of the code, currently applied to
footnotes, may well be also applied to floats.

Hints?
Vincent


[GSoC] BreakingAlgorithm: simplify handling of activeLines

2006-07-17 Thread Vincent Hennebert

Hi All,

Good news: before-floats are working. There probably are bugs and room
for improvement but I think it is time to submit a first patch, so that
you may see what I've done.

I'm currently cleaning up and documenting my code, and I think the
handling of the activeLines array may be simplified: currently, for a
line l, activeLines[2*l] points to the first active node for this line,
and activeLines[2*l+1] points to the last node. But the last node is
never directly accessed, only by starting at the first one and following
the links.

There must be a reason for this code but I don't see it. Perhaps this is
related to some older code which since was removed? Or have I missed
something? However, if it is ok I'll simplify that in my patch.


Vincent


Re: [GSoC] BreakingAlgorithm: simplify handling of activeLines

2006-07-18 Thread Vincent Hennebert

 Hi All,

 Good news: before-floats are working. There probably are bugs and room
 for improvement but I think it is time to submit a first patch, so that
 you may see what I've done.

 I'm currently cleaning up and documenting my code, and I think the
 handling of the activeLines array may be simplified: currently, for a
 line l, activeLines[2*l] points to the first active node for this line,
 and activeLines[2*l+1] points to the last node. But the last node is
 never directly accessed, only by starting at the first one and following
 the links.

Perhaps I misunderstand your question, but I think the last active node
in a line is used when adding yet another active node for that line at
the end of the linked list. In BreakingAlgorithm:addNode():

activeLines[headIdx + 1].next = node;


Ah yes, I get it now, thanks Finn. In fact this is to have the insertion
of a new node in constant time. Grrr, should have found it out by
myself.
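
For the record, here is a minimal sketch of that layout. It is a simplified stand-in, not the FOP class: nodes carry only a position, and the names are invented. The point is that storing the tail at index 2*l+1 lets addNode append without walking the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ActiveLinesSketch {
    static final class Node {
        final int position;
        Node next;
        Node(int position) { this.position = position; }
    }

    // For line l: activeLines[2*l] is the first node, activeLines[2*l + 1] the last one.
    private Node[] activeLines = new Node[8];

    void addNode(int line, Node node) {
        int headIdx = line * 2;
        if (headIdx + 1 >= activeLines.length) {
            activeLines = Arrays.copyOf(activeLines,
                    Math.max(headIdx + 2, activeLines.length * 2));
        }
        node.next = null;
        if (activeLines[headIdx + 1] != null) {
            activeLines[headIdx + 1].next = node; // append behind the stored tail: no traversal
        } else {
            activeLines[headIdx] = node;          // first node for this line
        }
        activeLines[headIdx + 1] = node;          // the new node is now the tail
    }

    /** Walks the list of one line from its head, as the rest of the algorithm does. */
    List<Integer> positions(int line) {
        List<Integer> result = new ArrayList<>();
        int headIdx = line * 2;
        if (headIdx < activeLines.length) {
            for (Node n = activeLines[headIdx]; n != null; n = n.next) {
                result.add(n.position);
            }
        }
        return result;
    }
}
```

Without the tail pointer, each insertion would have to follow the links from the head, so the cost would grow with the number of active nodes on the line.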



On the other hand, a different data structure for the nodes might very well
open up different improvements. The current structure, using a linked
list for each line, is just the best I could come up with at the time.


I think the structure is fine; I was about to propose to just switch to
Java Collections, as Simon said he did in his implementation, but I
think I'll leave it as is for now. As Simon's work will probably be
eventually integrated, this will be an opportunity of refactoring (BTW,
your work may help me implement side-floats; I'll have a closer look at
it once I'm done with before-floats).


Vincent


Two days offline

2006-07-22 Thread Vincent Hennebert
I personally would gladly work, but my brain no longer wants to. Guess I 
need a break.


Vincent


Re: DO NOT REPLY [Bug 39777] - [PATCH] GSoC: floats implementation

2006-07-25 Thread Vincent Hennebert

http://issues.apache.org/bugzilla/show_bug.cgi?id=39777





--- Additional Comments From [EMAIL PROTECTED]  2006-07-23 19:47 ---
I finally had a chance to take a first look. What I've seen so far looks pretty
nice. A first simple test behaved as would be expected. Making a before-float
too big to fit on a page (although there are break points inside the content)
results in an OutOfMemoryError (probably due to an infinite loop).

It would be good if you would write a set of test cases for before-floats as
your next task. This is to document what works and what doesn't and to give you
and us more confidence when doing further changes in the code.


Testcases should be ready tomorrow.



Finally, would you compile a list of classes you propose to move into the
breaking package? The idea itself is worth investigating since the layoutmgr
package has already grown rather big again.


On a general matter, I would put in the breaking subpackage all of the
classes from layoutmgr and its subpackages which are related to the
Knuth approach: the algorithm as well as the various Knuth elements. A
quick look gave me the following classes:

AbstractBreaker
BalancingColumnBreakingAlgorithm
BreakingAlgorithm
BlockKnuthSequence
InlineKnuthSequence
KnuthBlockBox
KnuthBox
KnuthElement
KnuthGlue
KnuthPenalty
KnuthPossPosIter
KnuthSequence
PageBreakingAlgorithm
inline.KnuthInlineBox

This will probably create many access problems (with currently
package-private members). But this will be an opportunity to clean up
the whole thing a bit, I think.

In a second step there is also a number of inner classes which might be
extracted and transformed into a top-level class of the new breaking
subpackage. I'm mainly thinking of
inline.LineLayoutManager.LineBreakingAlgorithm. I guess there are
reasons why they currently are inner classes, but it may be conceptually
better anyway to separate them from their surrounding class. This would
have to be studied more deeply.

As for when to apply the change, we may probably wait for the
integration of Simon's work. I created the package just because I had a
new class to put in it and thought that I might as well directly put it
in the right package.


So, WDYT?



What we need to decide now is whether to put the changes in a branch until they
stabilize or if we put it in Trunk. I'd prefer a branch for now so in case I
have time to finish the work on 0.93, we don't have any problems from this end.


A branch would be fine I think, as this would allow me to submit more
gradual patches.


Vincent


Re: DO NOT REPLY [Bug 39777] - [PATCH] GSoC: floats implementation

2006-07-31 Thread Vincent Hennebert

I managed to reenable one of the disabled test cases because you were fooled by
the default values for widows and orphans. Having only 3 lines in a block does
not allow any break possibilities with default widows and orphans. 4 lines
creates one break possibility in the middle.


Indeed, yes. Well, thanks!
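
The arithmetic behind that remark can be sketched as follows, assuming the XSL-FO initial values widows=2 and orphans=2 and ignoring everything else that may forbid a break (keeps, penalties). The class and method names are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class WidowsOrphansSketch {
    /**
     * Returns every k such that breaking after line k leaves at least `orphans` lines
     * before the break and at least `widows` lines after it.
     */
    static List<Integer> legalBreaks(int lineCount, int orphans, int widows) {
        List<Integer> breaks = new ArrayList<>();
        for (int k = Math.max(1, orphans); lineCount - k >= Math.max(1, widows); k++) {
            breaks.add(k);
        }
        return breaks;
    }

    public static void main(String[] args) {
        System.out.println(legalBreaks(3, 2, 2)); // []
        System.out.println(legalBreaks(4, 2, 2)); // [2]
    }
}
```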

Vincent


space-start and space-end for block-areas

2006-07-31 Thread Vincent Hennebert

Hi all,

I think there is a problem in the spec regarding the space-start and
space-end traits for block-areas. The like-named properties only apply
to inline-level formatting objects, so I guess that for block-areas
those traits are indirectly-derived from other properties (start-indent
and margin-*). The problem is that the spec explains nowhere
how to compute those traits, although I guess they should be given the
values of the corresponding margin properties.

Let me recall the three points of the spec which IMO are involved here:
1. in section 4.4.1, Stacked Block-areas, there is a rule, among
  others, stating that for a block-area:
the start-edge of its allocation-rectangle is parallel to the
start-edge of the content-rectangle of R (where R is the closest
ancestor reference-area of B), and offset from it inward by a
distance equal to the block-area's start-indent plus its
start-intrusion-adjustment (as defined below), minus its
border-start, padding-start, and space-start value

2. in section 5.3.2, Margin, Space and Indent Properties, there are
  rules for computing the values of start- and end-indent when margin-*
  is specified, and vice-versa.

3. in section 5.1.2, Computed Values, there is the following:
Specifying a value for one property determines both a computed
value for the specified property and a computed value for the
corresponding property.
  So in our situation, if margin-* is not specified then it must be
  computed from the (possibly inherited) specified value of
  start-indent.
  Here I disagree (but I may be wrong) with the IndentInheritance Wiki
  page, where it is written (e.g., in the first example) that the rules
  from 5.3.2 are not always triggered. In my opinion they are,
  otherwise we can't compute the value of the space-start trait (more
  below).


Now, let's consider the following problem:
- writing-mode is lr-tb, reference-orientation is 0 (most common case in
 western countries);
- we have an fo:block;
- we want to compute the offset of the start-edge of the generated
 block-areas's content-rectangle from the start-edge of the closest
 ancestor reference-area's content-rectangle.

Let's assume that:
- xa is the x-coordinate of the start edge of the block-area's
 allocation rectangle;
- xc is the x-coordinate of the start edge of its content rectangle;
- the origin of the coordinate system is the start-edge of the
 content-rectangle of the closest ancestor reference-area;
- there is no side-float so we can forget the start-intrusion-adjustment
 in the formulae.

Then we have the following:
   (1) xa = xc - start-indent (definition of allocation-rectangle in 4.2.3)
   (2) xa = start-indent - border-start - padding-start - space-start
       (section 4.4.1)
That gives us the offset of the block-area's content rectangle:
   (3) xc = 2*start-indent - border-start - padding-start - space-start

If margin-left is set on the fo:block (and assuming that the inherited
value of start-indent is 0), start-indent is computed like that:
   start-indent = margin-left + padding-left + border-left-width
which gives us for xc:
   xc = 2*margin-left + padding-left + border-left-width - space-start
This corresponds to the intuitive understanding provided that
space-start is set to margin-left.

If margin-left isn't specified but start-indent is, then the definition
of the start-indent property (§7.10.7) lets us expect that
   xc = start-indent
So the formula (3) becomes:
   start-indent = 2*start-indent - border-start - padding-start - space-start
which works only if
   space-start = start-indent - border-start - padding-start
Again, this is ok if we give to space-start the value of margin-left
computed according to section 5.3.2.

Thus I think we may assume that the space-start and space-end traits for
block-areas are given the computed values of the corresponding margin
properties.

Now there is a problem if the inherited value of start-indent is not 0.
Then the value of space-start becomes (3rd formula of section 5.3.2):
   space-start = start-indent - inherited-start-indent - padding-left
                 - border-left-width
so
   xc = start-indent + inherited-start-indent
which breaks the expected interpretation of the start-indent property in
§7.10.7

Applied to the following example, taken from the IndentInheritance Wiki
page:
   <fo:flow>
     <fo:block start-indent="10pt">indented text
       <fo:block>nested block</fo:block>
     </fo:block>
   </fo:flow>
For the nested block, start-indent is set to the inherited value of
10pt. margin-left is computed according to the formula of section 5.3.2:
   margin-left = start-indent - inherited(start-indent) - 0 - 0
               = 10pt - 10pt - 0 - 0
               = 0
So space-start = 0, and xc = 2*start-indent - 0 - 0 - 0 = 20pt, which is
not the expected value.
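
The two computations can be replayed with a few lines of code. This is only a check of the arithmetic under the working assumption made in this message, namely that the space-start trait takes the computed value of margin-left; lengths are in points and the method names are made up.

```java
class IndentSketch {
    /** Third formula of section 5.3.2 as quoted in the message (lengths in pt). */
    static int marginLeft(int startIndent, int inheritedStartIndent,
                          int paddingLeft, int borderLeftWidth) {
        return startIndent - inheritedStartIndent - paddingLeft - borderLeftWidth;
    }

    /** Formula (3) of the message: offset of the content rectangle's start edge. */
    static int contentStart(int startIndent, int borderStart, int paddingStart, int spaceStart) {
        return 2 * startIndent - borderStart - paddingStart - spaceStart;
    }

    public static void main(String[] args) {
        // Outer block: start-indent=10pt is specified, the inherited value is 0.
        int outerSpace = marginLeft(10, 0, 0, 0);
        System.out.println(contentStart(10, 0, 0, outerSpace)); // 10

        // Nested block: start-indent is inherited as 10pt.
        int nestedSpace = marginLeft(10, 10, 0, 0);
        System.out.println(contentStart(10, 0, 0, nestedSpace)); // 20
    }
}
```

The outer block lands at 10pt as expected, while the nested one ends up at 20pt, which is the discrepancy described above.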

I think the inherited value of start-indent should be removed from the
third formula of section 5.3.2. Note that this 

Re: space-start and space-end for block-areas

2006-08-01 Thread Vincent Hennebert

Really want to dig that one out again? :-)


He he ;-) I guess yes.

I was starting to look at all the intrusion-adjustment and -displace
stuff when I stumbled upon this issue. I need to have an absolutely
clear understanding of that if I want to implement side-floats
correctly.



Before I go into details inline below, let me stress that the margin-*
properties are defined in XSL-FO for compatibility with CSS. They don't


<rant>And that complicates things because we have to adapt the wordings
of one spec to another one. And this is all the more difficult as there
are always uncertainties in specs. They should be written in some kind
of formal language IMHO.</rant>


play a direct role in the FO geometry. 5.3.2 simply tells us how to map
the margin-* properties to start/end-indent. Furthermore, I think you
are mixing property evaluation (refinement stage) with the area model
(layout stage). 4.4.1 is about the area model and includes stuff like
intrusion-adjustment which is absolutely not a topic at the property handling
level (chapter 4 vs. chapter 5). So you can't just mix equations from
chapter 4 and 5.


Well, I'm aware of the difference between properties and traits, but I
was trying to find from which properties the space-start trait may be
derived for block-areas. Because the space-start property only applies
to inline-level formatting objects.


<snip/>

 Now, let's consider the following problem:
 - writing-mode is lr-tb, reference-orientation is 0 (most common case in
   western countries);
 - we have an fo:block;
 - we want to compute the offset of the start-edge of the generated
   block-areas's content-rectangle from the start-edge of the closest
   ancestor reference-area's content-rectangle.

 Let's assume that:
 - xa is the x-coordinate of the start edge of the block-area's
   allocation rectangle;
 - xc is the x-coordinate of the start edge of its content rectangle;
 - the origin of the coordinate system is the start-edge of the
   content-rectangle of the closest ancestor reference-area;
 - there is no side-float so we can forget the start-intrusion-adjustment
   in the formulae.

 Then we have the following:
 (1) xa = xc - start-indent (definition of allocation-rectangle in 4.2.3)

with no intrusion adjustment, also: xc = 0 + start-indent (7.10.7)

(inserting xc in (1)) --> xa = 0

7.10.7 makes a statement for the area model.


Ok.



 (2) xa = start-indent - border-start - padding-start - space-start
 (section 4.4.1)

With no intrusion adjustment:
(J1) space-start = start-indent - border-start - padding-start
(variables are traits here)


Ah yes, so this formula comes from the statement in 7.10.7 (BTW, in case
of mixed writing modes / reference-orientations this statement is wrong;
I thought they introduced the allocation-rectangle just for dealing with
that).
But then, the formula of 4.4.1 may be (greatly) simplified:
   xa = start-intrusion-adjustment
If it is so complicated, is it because the formula (J1) may not always
be true?



And that's not the same as 5.3.2:
margin-left = start-indent - inherited(start-indent) - padding-left
   - border-left-width
(variables are properties here)
More on that below.

 That gives us the offset of the block-area's content rectangle:
 (3) xc = 2*start-indent - border-start - padding-start - space-start

 If margin-left is set on the fo:block (and assuming that the inherited
 value of start-indent is 0), start-indent is computed like that:
 start-indent = margin-left + padding-left + border-left-width
 which gives us for xc:
 xc = 2*margin-left + padding-left + border-left-width - space-start
 This corresponds to the intuitive understanding provided that
 space-start is set to margin-left.

 If margin-left isn't specified but start-indent is, then the definition
 of the start-indent property (§7.10.7) lets us expect that
 xc = start-indent
 So the formula (3) becomes:
 start-indent = 2*start-indent - border-start - padding-start - space-start
 which works only if
 space-start = start-indent - border-start - padding-start
 Again, this is ok if we give to space-start the value of margin-left
 computed according to section 5.3.2.

 Thus I think we may assume that the space-start and space-end traits for
 block-areas are given the computed values of the corresponding margin
 properties.

 Now there is a problem if the inherited value of start-indent is not 0.
 Then the value of space-start becomes (3rd formula of section 5.3.2):
 space-start = start-indent - inherited-start-indent - padding-left
 - border-left-width

No, 5.3.2 says:

margin-left = start-indent - inherited(start-indent) - padding-left
   - border-left-width

margin-left != space-start !!!
space-start depends on the intrusion-adjustment. margin-left does not.


I don't think so, as there is a start-intrusion-adjustment trait for
that. But I'm ok with the rest.


<snip/>

 So, I've the feeling that on this issue the spec is both incomplete
 (how
 to compute 

Re: space-start and space-end for block-areas

2006-08-01 Thread Vincent Hennebert

<snip/>

 Ah yes, so this formula comes from the statement in 7.10.7 (BTW, in
 case
 of mixed writing modes / reference-orientations this statement is wrong;

I don't think so. FOs like block-container create a viewport/reference
pair. The viewport-area does the rotation, the reference-area is already
in the same rotation as its immediate children.


Well, again I may be wrong, but: in section 4.2.3 we have the following:
the content-rectangle of an area uses the inline-progression-direction
and block-progression-direction of that area; but the border-rectangle,
padding-rectangle, and allocation-rectangle use the directions of its
parent area. Thus the edges designated for the content-rectangle may not
correspond to the same-named edges on the padding-, border-, and
allocation-rectangles.

So if we want to put, say, some Japanese in an English text, the main
flow would be in lr-tb writing-mode, and we would put an
fo:block-container with a tb-rl writing-mode. The block-container would
generate a viewport-reference pair of areas with the following mapping:

   border-rectangle, padding-rectangle, allocation-rectangle
before-edge  top
after-edge   bottom
start-edge   left
end-edge     right
   content-rectangle
before-edge  right
after-edge   left
start-edge   top
end-edge     bottom

Note that in section 4.2.2, it is stated that the start-edge and
end-edge of the content-rectangle of [the reference-area] are parallel
to the start-edge and end-edge of the content-rectangle of [the
viewport-area].

If we set a start-indent for the fo:block-container, this would be the
space between the left-edge (start-edge) of the flow's content-rectangle
and the left-edge (after-edge) of the block-container's
content-rectangle.
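
The mapping above can be written down as a small lookup. This is a sketch under the reading of 4.2.3 given in this message (content-rectangle follows the area's own directions, the other rectangles follow the parent's), with reference-orientation assumed to be 0; the names `EDGES` and `absolute_edges` are made up for the example.

```python
# Relative-to-absolute edge mapping for the two writing modes discussed here.
EDGES = {
    "lr-tb": {"before": "top", "after": "bottom", "start": "left", "end": "right"},
    "tb-rl": {"before": "right", "after": "left", "start": "top", "end": "bottom"},
}


def absolute_edges(parent_mode, own_mode):
    """Return, per rectangle, the absolute side of each relative edge."""
    outer = EDGES[parent_mode]   # directions of the parent area
    return {
        "border-rectangle": outer,
        "padding-rectangle": outer,
        "allocation-rectangle": outer,
        "content-rectangle": EDGES[own_mode],   # the area's own directions
    }


mapping = absolute_edges("lr-tb", "tb-rl")
print(mapping["allocation-rectangle"]["start"])   # left
print(mapping["content-rectangle"]["after"])      # left
```

Both lookups give the left side, which is why a start-indent on the block-container ends up measured to the after-edge of its content-rectangle.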

WDYT?



 I thought they introduced the allocation-rectangle just for dealing with
 that).
 But then, the formula of 4.4.1 may be (greatly) simplified:
 xa = start-intrusion-adjustment
 If it is so complicated, is it because the formula (J1) may not always
 be true?

I don't think it's that complicated. The spec just tries to explain the
relationships of the various properties and traits. I don't think all of
these statements are meant to be used as literal formulas. I think we
rarely, if at all, have to deal with the space-start trait, for example.
Right now, we shift in by start-indent and out again by padding-start
and border-start. That's all that's necessary to paint everything. Of


Of course, but independently of how this is actually implemented, I'd
like to make sure I understand the formulae correctly.


<snip/>

 Well, I think I get it now. It's frustrating to spend so much time
 in
 just trying to understand a spec...

Imagine how I felt in the first half of 2005 while getting up to speed
with the mean details of the spec. It's just normal you feel that way.
If you look in the archives, you'll see that you're not alone. A feeble
consolation, I know.


Yes, but still one. This seems to confirm that the problem lies in the
spec, not in my brain ;-)



 Thank you, Jeremias.
 Vincent


 PS: There seems to be a problem, then, with the third paragraph of the
 attached fo file. IIUC it should be placed 1 cm right from the black
 border. And if I remove the start-indent=0 attribute from the fo:block
 it should be placed 2 cm right. WDYT?

Yes, that's what's happening and what should be happening. I don't see
the problem, sorry.


Well, with my working copy I get the following results:
When start-indent is explicitly set to 0cm for the third paragraph,
the text is placed 1 cm /left/ from the black border:
http://atvaark.dyndns.org/~vincent/ref-area_start-indent-0cm.pdf

When start-indent is unset, the text is placed 2 cm left from the black
border:
http://atvaark.dyndns.org/~vincent/ref-area_start-indent-none.pdf

Same result with Fop 0.92beta. My working copy is up-to-date with the
repository and contains no local modifications. I wonder where the
problem is. May I ask which results others get?


Thanks,
Vincent


Re: Implementing OpenType font support, how hard?

2006-08-02 Thread Vincent Hennebert

Hi Bertrand,

As I've done some work in this area, I can provide a few additional
hints. In fact I'm kind of a bridge between FOray and Fop and am
working on adapting FOrayFont to Fop. Currently I'm not doing much
because I'm busy with some other work on Fop for the Google Summer of
Code, but I should have some time again to work on this from
mid-September on (mmmh, is it too late for you?).


<snip/>

 2d) Re-enable kerning, as OpenType fonts are usually of high quality
 and deserve to be used with automatic kerning.

Ok, that should be obsolete.

One point about 2) is that Vincent Hennebert and Victor Mote are working
on FOrayFont to create a better font library which we'd like to use when
it's finished. So this may mean that some of this work would better be
done in/for FOrayFont. Finishing 2) would then also mean finishing
FOrayFont to the degree that it can be used in FOP. I guess that will
need further deliberation.


(Some quick background on this: I submitted a patch in December 2005
which integrated FOrayFont into Fop; it was not applied because of too
severe limitations of FOrayFont. I'm currently working on implementing
the missing features; there are still two of them to implement, and then
the adaptation may be restarted. Of course the Fop code has evolved
quite a bit since December.)

I believe there is basic OpenType support in FOrayFont, I can't say more
without having a deeper look. AFAICT there are two areas of work:
* complete the parsing of OpenType font files;
* make sure the API provides access to the most advanced features of
 OpenType fonts.

The FOray project:
http://foray.sourceforge.net/
You may be interested in subscribing to the foray-dev list.



 3) Additional steps for OpenType GSUB table support
 The goal is to enable the smart font features of OpenType, automatic
 ligatures as mentioned above, language-dependent glyph substitutions
 (different shapes if a letter is at the beginning of a word for
 example), automatic decorative swashes at the beginning or end of
 words, etc.

 3a) Decode the GSUB table of the OpenType font (and other tables that
 might be required to use it) and store its data in the FOP XML font
 metrics file

One goal of FOrayFont is to make the separate metrics file obsolete. The
font files will be interpreted directly. This should also simplify the
whole system, especially for the user.

 3b) Modify the chars-to-metrics mapping to handle things like
 automatic ligatures, where several chars map to a single glyph

Here I think you can profit from my work on kerning to handle special
cases.


The only problem I see with ligatures is when a word may be hyphenated
between two characters for which there is a ligature: if it ends up
being hyphenated the separate glyphs should be used, otherwise the
ligature glyph should be used. I don't think this can be easily
represented in the current Knuth glue/box/penalty model which is used to
break lines into paragraphs.
I also believe that in German (and perhaps other languages), ligatures
are not welcome in some compound words.
And I wonder if the ligature mechanism is similar for non-western
languages which heavily use them (Arabic, for example).
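
For reference, TeX handles this case with a "discretionary" break, which carries three alternatives: the glyphs before the break, the glyphs after it, and the glyphs used when no break is taken. The sketch below models that idea only; it is not how Fop's Knuth elements are defined, the class and function names are invented, and the hyphenation point between the two f's is assumed for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Discretionary:
    pre_break: str    # glyphs ending the line if the break is taken
    post_break: str   # glyphs starting the next line if the break is taken
    no_break: str     # glyphs used if the break is not taken


def render(parts, break_at=None):
    """Return the lines obtained when breaking at parts[break_at] (or nowhere)."""
    lines, current = [], ""
    for index, part in enumerate(parts):
        if isinstance(part, Discretionary):
            if index == break_at:
                lines.append(current + part.pre_break)
                current = part.post_break
            else:
                current += part.no_break
        else:
            current += part
    lines.append(current)
    return lines


# U+FB03 is the "ffi" ligature, U+FB01 the "fi" ligature.
word = ["a", Discretionary("f-", "\ufb01", "\ufb03"), "ne"]
print(render(word))              # one line, ffi ligature used
print(render(word, break_at=1))  # two lines, separate f plus fi ligature
```

The point of the three-way element is that the choice of glyphs is postponed until the breaking decision is known, which a plain box/glue/penalty sequence cannot express directly.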



 3c) Implement GSUB table handling, glyph substitutions (or reuse an
 existing library for this, but the only one that I've found is
 freetype, haven't found one in Java).

 3d) Create test documents to demonstrate this, asking a font provider
 for a donation of some OpenType fonts to use in FOP tests.

That's one possibility. Another one might be the DejaVu fonts which we
have found after a LOT of searching for a font with an ASF-compatible
license. However, I haven't received any official feedback on license
compatibility, yet. OTOH, I'm not sure if those fonts will enable you to
show off all the features you want to implement.


Aren't DejaVu fonts only TrueType fonts?



 Even this wouldn't be complete, as OpenType allows specific features
 to be enabled for specific character runs, like use alternate glyph
 set 2 for this character only. But it would be a good start already
 ;-)

:-) Sure.

 At this point I'm mostly interested in your opinion on points 1) and
 2) above, if these enhancements seem realistic I might be able to work
 on them in my current project. Point 3) obviously needs more work and
 might not fit my budget at this point.

In general, I'm happy if we get some reinforcements on the font front.
2) shouldn't be a very big task. But I assume the whole FOrayFont thing
might make this a little more complicated.

AFAIK, OpenType allows different variants of a font in one font file
(ex. normal and bold). We've had requests to support those font files.
Have you found out during your investigations what would be involved in
supporting this and would this be in scope for your work? So far, I've
been unable to find out how this is handled.

 Thanks for any feedback on this!
 -Bertrand

 [1] http://xmlgraphics.apache.org/fop/0.20.5/fonts.html#embedding
 [2] http

start-indent for line-areas

2006-08-03 Thread Vincent Hennebert

Hi All,

Hem, once again :-\

In section 4.5 of the spec it is written that, for a line-area, the
start-edge of its allocation-rectangle is offset from the start-edge of
the content-rectangle of the nearest ancestor reference-area by the sum
of its start-indent and start-intrusion-adjustment.

The start- and end-edges of the allocation-rectangle are the same,
whichever value the line-stacking-strategy trait takes.

A line-area is a block-area, so the start-edge of its
allocation-rectangle extends outside the content-rectangle by
start-indent.

Thus the x-coordinate of the content-rectangle is 2*start-indent +
start-intrusion-adjustment?! Obviously this is wrong.

I guess it should better... no, I don't guess anything. What have I
missed?

Thanks,
Vincent


Re: start-indent for line-areas

2006-08-04 Thread Vincent Hennebert

A line-area is a special sort of block-area (4.5, 1st sentence), it
does not have any border and padding. Furthermore, 4.4 defines the
behaviour of block-areas and makes special comments that many of those
features don't apply to block-areas which are line-areas (for example for
start/end-indent).


Hmmm, reading and re-reading the spec I find nothing about that. Section
4.4 says that a block-area which is not a line-area must be properly
stacked. So that holds for a block-area with line-area children, which
leads me to think that the stacking rules of 4.4.1 apply to line-areas.
I mean, in the given description B may be a line-area.



So, I'm not sure where you got your 2*start-indent from, but I think


A line-area being a block-area, x_content-rectangle =
x_allocation-rectangle + start-indent. Section 4.5 says that
x_allocation-rectangle = start-indent + start-intrusion-adjustment. So
x_content-rectangle = 2*start-indent + start-intrusion-adjustment.

It may be that the start-indent of a line-area is not equal to the
start-indent of its parent block-area. But then I don't know how it is
supposed to be computed.

It may be that for line-areas, the allocation-rectangle should rather be
the border-rectangle (and, then, also the content-rectangle since
line-areas have no border nor padding). The definition of the
allocation-rectangle for a line-area in section 4.5 would then be
consistent, the line-area's rectangle would coincide (when there is no
intrusion) with the parent's content-rectangle in the i-p-d. This would
correspond to what you said just below:

you may not involve start/end-indent with line-areas. AFAIU, line-areas
all extend to the edges of the parent content-rectangle in
inline-progression-direction (i.e. start and end) if there's no
intrusion.



Or perhaps this definition is wrong and the start-edge of the
allocation-rectangle should coincide with the start-edge of the ancestor
ref-area's content-rectangle (when there is no intrusion). Like for
other block-areas, in fact.

I think I'll go with the second possibility. Of course, I guess the
allocation-rectangle does not appear in the code but this is to be sure
placements will be rightly computed.
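
The two readings can be compared with a bit of arithmetic. A minimal sketch, with invented function names and arbitrary values; only the two offset rules come from the discussion above.

```python
def content_x_literal(start_indent, start_intrusion_adjustment):
    # Literal reading of 4.5: the allocation-rectangle is already offset by
    # start-indent + start-intrusion-adjustment, and the content-rectangle
    # is then offset by start-indent once more (block-area rule).
    x_allocation = start_indent + start_intrusion_adjustment
    return x_allocation + start_indent


def content_x_second_reading(start_indent, start_intrusion_adjustment):
    # Second possibility: the allocation-rectangle starts at the ref-area's
    # content-rectangle start-edge, shifted only by the intrusion adjustment.
    x_allocation = start_intrusion_adjustment
    return x_allocation + start_indent


print(content_x_literal(10, 0))         # start-indent counted twice
print(content_x_second_reading(10, 0))  # start-indent counted once
```

Only the second reading places the line's content where the start-indent says it should be.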



Does that help?


Yes thanks,
Vincent


Re: [Xmlgraphics-fop Wiki] Update of GoogleSummerOfCode2006/FloatsImplementationProgress/ImplementingSideFloats by VincentHennebert

2006-08-11 Thread Vincent Hennebert

2006/8/11, Apache Wiki:

Dear Wiki user,

You have subscribed to a wiki page or wiki category on Xmlgraphics-fop Wiki 
for change notification.

The following page has been changed by VincentHennebert:
http://wiki.apache.org/xmlgraphics-fop/GoogleSummerOfCode2006/FloatsImplementationProgress/ImplementingSideFloats

The comment on the change is:
Difficulties around side-floats


Ok guys, I've been searching for 3 days for a simple, elegant, powerful,
effective, lightweight solution to make the total-fit algorithm work
with side-floats, but can't seem to find one. This issue is somewhat
related to the one regarding differing available ipd in page sequences
(solving one might help solve the other one, at least). There is also
some similarity with tables, i.e., how to combine several vertical
Knuth element lists into just one. Except that in the case of
side-floats, we may choose several completely different solutions to
place them. And this is a case-by-case decision: sometimes it will be
better to defer the float, sometimes to break it, sometimes to
compress it...

In some situations, a best-fit approach could even produce better
results, as we would have the possibility to consider deferring a
side-float. But it is well-known that best-fit may be much worse than
total-fit regarding before-floats and footnote placements.

Looking at how tables are handled might give me some ideas for
side-floats, but this would require some time and there isn't much left
now.

I also have some ideas for improving the handling of before-floats and
footnotes, and I'd like to implement them while I have time and it's
still fresh in my memory.

The implementation I propose on the Wiki page has its limitations but
should work in most cases. This might give a basis for further
improvements.

I'm thinking about running a poll on fop-user to find out what users
expect from side-floats, and how they would use them. This might help
make some design decisions.

Well, in a word, I'm a bit lost as to what to do now.

WDYT?
Vincent


Re: [Xmlgraphics-fop Wiki] Update of GoogleSummerOfCode2006/FloatsImplementationProgress/ImplementingSideFloats by VincentHennebert

2006-08-11 Thread Vincent Hennebert

2006/8/11, Jeremias Maerki:
<snip/>

 In some situations, a best-fit approach could even produce better
 results, as we would have the possibility to consider deferring a
 side-float. But it is well-known that best-fit may be much worse than
 total-fit regarding before-floats and footnote placements.

Are you sure that it's much worse? Note that even TeX uses best-fit
for page breaking.


Oh yes I'm sure ;-) It's a common complaint from users of the LaTeX
world that figures are placed in a weird manner. There are plenty of
parameters to tweak the output by hand, but this doesn't always give
satisfying results. And given that manual intervention isn't really an
option in the FO world...
Note also that TeX implemented best-fit for page-breaking only because
of the limited computing resources at the time (you know, those good old
80's).
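
The difference between the two strategies can be shown on a toy model. This is only a sketch: pages hold plain boxes of fixed height, there are no floats or footnotes, every page (including the last) is charged the square of its unused height, and all names and numbers are invented for the example.

```python
from functools import lru_cache


def page_demerits(used, page_height):
    return (page_height - used) ** 2


def best_fit(heights, page_height):
    """Greedy: close each page at the locally cheapest feasible break."""
    total, start, n = 0, 0, len(heights)
    while start < n:
        used, choice = 0, None
        for end in range(start + 1, n + 1):
            used += heights[end - 1]
            if used > page_height:
                break
            demerits = page_demerits(used, page_height)
            if choice is None or demerits < choice[0]:
                choice = (demerits, end)
        if choice is None:
            raise ValueError("element taller than the page")
        total += choice[0]
        start = choice[1]
    return total


def total_fit(heights, page_height):
    """Dynamic programming: minimise the demerits summed over all pages."""
    n = len(heights)

    @lru_cache(maxsize=None)
    def cost_from(start):
        if start == n:
            return 0
        best, used = None, 0
        for end in range(start + 1, n + 1):
            used += heights[end - 1]
            if used > page_height:
                break
            rest = cost_from(end)
            if rest is None:
                continue
            cost = page_demerits(used, page_height) + rest
            if best is None or cost < best:
                best = cost
        return best

    result = cost_from(0)
    if result is None:
        raise ValueError("element taller than the page")
    return result


boxes = [5, 4, 3, 8]
print(best_fit(boxes, 10), total_fit(boxes, 10))
```

On this input the greedy strategy fills the first page as well as it can and then pays for an almost empty second page, whereas the global optimum accepts a looser first page.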



 Looking at how tables are handled might give me some ideas for
 side-floats, but this would require some time and there isn't much left
 now.

 I also have some ideas for improving the handling of before-floats and
 footnotes, and I'd like to implement them while I have time and it's
 still fresh in my memory.

 The implementation I propose on the Wiki page has its limitations but
 should work in most cases. This might give a basis for further
 improvements.

 I'm thinking about running a poll on fop-user to find out what users
 expect from side-floats, and how they would use them. This might help
 make some design decisions.

That should certainly provide some interesting feedback.

 Well, in a word, I'm a bit lost as to what to do now.

Hmm, yeah, it's a bit difficult. Best-fit vs. total-fit plays into this.
Best-fit will most likely replace total-fit because of the additional
features we can cover. If that means some drawbacks on things like
footnote placement that could be acceptable if the drawbacks are not too
great. ATM, I cannot estimate the impact of the change. Too bad we don't


Well, to me switching to best-fit is out of the question. Total-fit is a
killer feature for technical documents with lots of figures and
footnotes (the current implementation is already better than Xep...).
This would give Fop a big advantage over competing implementations.

Rather, we might consider implementing both strategies. Ideally there
would be an option in the config file ("optimize-for-books" vs.
"optimize-for-fancy-layouts" or something like that). The generation of
Knuth elements would be the same, only the breaking algorithm would
differ. But abandoning total-fit is not an option, IMO. Moreover...


have much theoretical reference material on using the Knuth approach on
page breaking. Most of what exists today around page breaking is by M.F.
Plass or worked out by us on our Wiki.


... I'm still hoping to find a solution compatible with the total-fit
approach. The glue/box/penalty model is simply not powerful enough to
represent tables, side-floats and the like. All we have to do is find
another model...



I would not consider it a bad move if you concentrated the rest of your
time of the GSoC on the before-floats and footnotes, especially if it's
unlikely that you can finish side-floats in time and if the switch from
total-fit to best-fit hangs in the air. Even without an actual
implementation the groundwork for a side-float implementation in form of
some very good documentation is a very satisfying result. I would
consider the goals of the GSoC project met. It might be good to add some
best-fit-specific comments, though.


Ok, will do.



How does that sound?


Well, now I know what to do ;-) I'll first work on footnotes and
before-floats improvements. I'll also try to implement the simplified
solution for side-floats. Depending on the results I'll write the
suitable documentation.


Thanks!
Vincent


Re: [Xmlgraphics-fop Wiki] Update of GoogleSummerOfCode2006/FloatsImplementationProgress/ImplementingSideFloats by VincentHennebert

2006-08-15 Thread Vincent Hennebert

Hi Simon,


Vincent,

This page represents a good piece of work.

First some nit picking regarding language:

'A, if X; B, else': change else -> otherwise (2x)

'We may choose to either differ a side-float, or ...': change differ -> defer


Thanks. Changed.



Then a comment on the rules:

Rule 1. Why does the rule not require both x >= 0 and x + ipd <=
ipd(ref-area), for both start and end floats, unless the float is
wider than the ipd(ref-area)? In other words, why is rule 7 not
required for any start and end floats?


Hey, you're right. Ok, rule 1 is correct: a start-float may not stick
out of the start-edge of the ref-area, period. The constraint on the
opposite side is given by rule 7, which actually is badly phrased. More
precisely, the formal wording is not equivalent to the loose wording
given between parentheses. The loose wording is quite clear and
intuitive; the formal wording forgets the case of a float alone: being
alone does not allow it to stick out of the ref-area.

Well, now that you've pointed it out this is pretty obvious... I've
reformulated this rule, this time according to the loose wording. Tell me
if you don't agree.



In 'Properties of the model': I do not see that rule 7 is satisfied.


A start-float begins at the end-edge of the ref-area and is pulled along
this edge (which is like a wall). So by nature it may not stick out of
this edge.

What the illustration attempts to show is that if previous start-floats
occupy too much place, then the new start-float will strike against
their after-edges (the start-guide) without being able to go beforer.



Finally some thoughts on a possible algorithm:

The algorithm should combine pagebreak and linebreak calculations in a
single dynamic calculation:


Yes, I think we should try to find something like that.



for each legal pagebreak
  for each active pagebreak node
    layout the page and calculate its demerits

Laying out the page involves breaking each paragraph on the page into
lines; each legal linebreak/active linebreak node combination (that
is, each iteration in the two nested loops of the linebreak
calculation) is associated with a certain side float layout, and thus
the line widths for that case are known and the demerits can be
calculated.


One issue is that some legal pagebreaks are unknown until paragraphs are
laid out (because of widows/orphans, for example). So the "for each
legal pagebreak" loop is not that simple and might involve some
backtracking.
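
The nested loops proposed above can be sketched as follows. This only illustrates the control flow: `layout_page` is a hypothetical callback standing in for the real page layout (paragraph line breaking, side-float placement), every pagebreak is assumed to be known in advance, and the widow/orphan complication is ignored.

```python
def break_pages(n_elements, layout_page):
    """Forward dynamic programme over legal pagebreaks.

    layout_page(start, end) lays out elements start..end-1 on one page and
    returns its demerits, or None if that page is infeasible.
    """
    best = {0: (0, None)}   # break index -> (total demerits, previous break)
    active = [0]            # active pagebreak nodes
    for brk in range(1, n_elements + 1):        # for each legal pagebreak
        candidates = []
        for node in active:                     # for each active pagebreak node
            demerits = layout_page(node, brk)   # lay out the page
            if demerits is not None:
                candidates.append((best[node][0] + demerits, node))
        if candidates:
            best[brk] = min(candidates)
            active.append(brk)
    if n_elements not in best:
        raise ValueError("no feasible sequence of pagebreaks")
    # Walk back through the chosen predecessors to recover the breaks.
    breaks, index = [], n_elements
    while index:
        breaks.append(index)
        index = best[index][1]
    return best[n_elements][0], breaks[::-1]


# Stand-in for the real layout: fixed-height lines on a 6-unit page.
heights = [3, 3, 3, 3]


def layout_page(start, end):
    used = sum(heights[start:end])
    return None if used > 6 else (6 - used) ** 2


print(break_pages(len(heights), layout_page))
```

A real implementation would also deactivate nodes that can no longer lead to a feasible page, and would cache the line-breaking results per line width as suggested further down.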



It is just a rough idea. I have no clear picture what the linebreak
calculation in combination with side floats looks like. I just have
the feeling that it should be possible in principle. Obviously, it may


Yes. I think the biggest problem is to decide when to defer a float or
not. Otherwise, as new floats have no impact on already computed lines,
the current line-breaking algorithm won't need much reworking.

Of course, I'm not speaking of values other than "line" for the
intrusion-displace property. I don't even want to think about the other
values...



be necessary to break a paragraph into lines several times, each time
with a different side float layout. It will be necessary to store
earlier linebreak calculations for a paragraph in a clever way, so as
to avoid unnecessary recalculations for identical linewidths. Working
this out into a realistic algorithm requires much more thinking.

Regards, Simon


Thanks for your comments, Simon.

Vincent


Re: [Xmlgraphics-fop Wiki] Update of GoogleSummerOfCode2006/FloatsImplementationProgress/ImplementingSideFloats by VincentHennebert

2006-08-15 Thread Vincent Hennebert

2006/8/15, Simon Pepping:

Hi,

One more thing. All your beautiful pictures are on your own web
site. Would you mind if we copy the pictures to the home page of one
of the committers on people.apache.org, and change the links on the
Wiki page? That is a more permanent solution.


I fully agree. We may perhaps just wait for the end of the GSoC, as I
might create new ones or rework some until then.

Vincent


Re: Some comments on improving the algorithm for before-floats

2006-08-16 Thread Vincent Hennebert

Hi Simon,

Ok, I've taken out my LaTeX book again to be sure I understand you.


Vincent,

Your proposal to improve the algorithm for the placement of footnotes
and before-floats sounds fine. A few comments.

'Ideally there would be a configuration setting telling which ratio of
the page should be filled with normal content; if this ratio is null
then pages only made of out-of-line objects would be allowed.' I think
this may be split into several configuration settings:
- The minimum amount of normal content on a page.


OK. This corresponds to the \textfraction parameter, right?



- Whether float pages are allowed. Even when the minimum amount is not
  zero, the user may set this to true.


OK. ...mmmh, found no dedicated LaTeX parameter for that.
\floatpagefraction=0?



- The minimum amount of float content on a float page before it may be
  considered feasible. Only relying on the normal demerits calculation
  for the stretch or shrink may be too restrictive.


Moreover, if the figures are made of images, there is likely to be
little shrink/stretch.
Is this also the \floatpagefraction parameter? Actually I don't really
understand this parameter. At least, I don't see the point of it:
does it mean that underfull float-only pages are acceptable? This looks
weird to me.

But as it would be easy to implement, I can do it. Related question:
would footnotes be allowed on float-only pages, or only before-floats?
This may be useful for books with many many footnotes. But for other
books this can look weird. WDYT? Another config parameter?
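
The three settings under discussion could be grouped along these lines. A sketch only: the class, the field names and the default values are invented for illustration, footnotes on float-only pages are left out, and the comments merely point to the LaTeX parameters mentioned in this thread.

```python
from dataclasses import dataclass


@dataclass
class FloatPlacementConfig:
    min_normal_content: float = 0.2    # minimum share of normal content (cf. \textfraction)
    allow_float_pages: bool = True     # whether float-only pages may be produced
    min_float_page_fill: float = 0.5   # minimum fill of a float-only page (cf. \floatpagefraction)


def page_is_acceptable(normal_height, float_height, page_height, config):
    """Decide whether a page with the given content heights may be considered."""
    if normal_height + float_height > page_height:
        return False
    if normal_height == 0:
        # Float-only page: must be allowed and sufficiently filled.
        return (config.allow_float_pages
                and float_height >= config.min_float_page_fill * page_height)
    return normal_height >= config.min_normal_content * page_height


config = FloatPlacementConfig()
print(page_is_acceptable(0, 60, 100, config))    # well-filled float page
print(page_is_acceptable(0, 30, 100, config))    # underfull float page
print(page_is_acceptable(10, 50, 100, config))   # too little normal content
```

Keeping the float-page switch separate from the minimum amount of normal content matches the remark above that float pages may be allowed even when that minimum is not zero.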



In fact, these are configuration parameters in LaTeX.

Regarding the demerits for deferred out-of-line objects, a simple
multiplication with the page difference produces a linear
relation. This may be too weak, and a squared or steeper relation may
be preferable.


No. Period.

Ok, some explanations ;-) This would break the property of optimal
substructure which makes the dynamic programming approach work. In his
thesis, Plass proved that using a squared function leads to an
NP-complete problem. In "Pagination Reconsidered", Brüggemann-Klein et
al. showed that using a linear function is nearer to a human's feelings,
is solvable by dynamic programming, and gives satisfying results. So
I think we may go with it.
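
A linear deferral penalty can be written in a couple of lines. This is a sketch with an invented function name and a hypothetical `weight` constant, not a value taken from Fop; it only illustrates the shape of the function.

```python
def deferral_demerits(citation_page, placement_page, weight=100):
    """Linear demerits for an out-of-line object placed after its citation."""
    if placement_page < citation_page:
        raise ValueError("object placed on a page before its citation")
    return weight * (placement_page - citation_page)


def squared_demerits(citation_page, placement_page):
    return (placement_page - citation_page) ** 2


# Deferring by two pages costs exactly as much as two one-page deferrals,
# so the cost can be charged page by page as the algorithm advances.
print(deferral_demerits(1, 3), deferral_demerits(1, 2) + deferral_demerits(2, 3))

# A squared penalty does not decompose this way.
print(squared_demerits(1, 3), squared_demerits(1, 2) + squared_demerits(2, 3))
```

This additivity is presumably what keeps the linear variant within reach of dynamic programming; the works cited above give the actual proofs.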



Regards, Simon


Thank you,
Vincent


Re: [Xmlgraphics-fop Wiki] Update of GoogleSummerOfCode2006/FloatsImplementationProgress/ImplementingSideFloats by VincentHennebert

2006-08-18 Thread Vincent Hennebert

Hi Simon,

2006/8/17, Simon Pepping:

 Rule 1. Why does the rule not require both x >= 0 and x + ipd <=
 ipd(ref-area), for both start and end floats, unless the float is
 wider than the ipd(ref-area)? In other words, why is rule 7 not
 required for any start and end floats?

 Hey, you're right. Ok, rule 1 is correct: a start-float may not stick
 out of the start-edge of the ref-area, period. The constraint on the
 opposite side is given by rule 7, which actually is badly phrased. More
 precisely, the formal wording is not equivalent to the loose wording
 given between parentheses. The loose wording is quite clear and
 intuitive; the formal wording forgets the case of a float alone: being
 alone does not allow it to stick out of the ref-area.

 Well, now that you've pointed it out this is pretty obvious... I've
 reformulated this rule, this time according to the loose wording. Tell me
 if you don't agree.

Rule 7a is logically correct, but I would say that the rule simply
states that a start float should not stick out at the end side, even
if it is not the one that is flush with the start side. Then x + ipd
<= ipd(ref-area) follows even without the condition ipd <=
ipd(ref-area).


Hmmm. If the ipd of a float is greater than ipd(ref-area), then it /is/
allowed to stick out at one side (end-side for start-floats, start-side
for end-floats). On the contrary, if ipd <= ipd(ref-area), then the
float is not allowed to stick out at any side. That's why there is the
condition. Don't you agree?
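
The condition argued for here can be written as a small check. A sketch, assuming x is the offset of the float's start-edge from the start-edge of the ref-area in the inline-progression-direction; the function name and its arguments are illustrative and not taken from the Wiki page.

```python
def float_position_ok(kind, x, ipd, ref_ipd):
    """Check a side-float's inline position against the reference area."""
    if kind not in ("start", "end"):
        raise ValueError("kind must be 'start' or 'end'")
    sticks_out_start = x < 0
    sticks_out_end = x + ipd > ref_ipd
    if ipd <= ref_ipd:
        # The float fits: it may not stick out on either side.
        return not sticks_out_start and not sticks_out_end
    # The float is wider than the ref-area: it may overflow only on the
    # side opposite to the one it is attached to.
    if kind == "start":
        return not sticks_out_start
    return not sticks_out_end


print(float_position_ok("start", 0, 50, 100))    # fits, stays inside
print(float_position_ok("start", 60, 50, 100))   # fits, but sticks out at the end
print(float_position_ok("start", 0, 120, 100))   # too wide, overflows at the end
print(float_position_ok("end", -20, 120, 100))   # too wide, overflows at the start
```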



Rule 7b: the conclusion does not follow. The argument should be that
an end float should not stick out at the start side, so that x >= 0.


Same here: x must be >= 0 unless ipd > ipd(ref-area).

Thinking about it again, I believe the normative wording of rule 7
actually is correct, even if it doesn't say exactly the same thing as
the non-normative one; when coupled with rule 9, it becomes equivalent.
I think I'm going crazy.



 In 'Properties of the model': I do not see that rule 7 is satisfied.

 A start-float begins at the end-edge of the ref-area and is pulled along
 this edge (which is like a wall). So by nature it may not stick out of
 this edge.

 What the illustration attempts to show is that if previous start-floats
 occupy too much place, then the new start-float will strike against
 their after-edges (the start-guide) without being able to go beforer.

I see now that the rules are satisfied. To show that it is only
necessary to point out that it is satisfied by the initial position,
and is not violated by subsequent movements. Whether it is stopped by
the start guide is not relevant in this argument.


Well, the illustration made sense when rule 7 was written the
previous way. Now it could well be removed... unless I rewrite rule 7 as
previously.



You say nothing about the end floats. The argument is of course the
same.


Will add a word.



 One issue is that some legal pagebreaks are unknown until paragraphs are
 laid out (because of widows/orphans, for example). So the "for each
 legal pagebreak" loop is not that simple and might involve some
 backtracking.

Yes, there is a problem there. The solution could be as follows: When
the legal pagebreak is in a paragraph, it is also the considered legal
linebreak. It is tested whether this linebreak could end the last line
of the page.


And deactivate the node if it turns out that this linebreak corresponds
to (e.g.) the next-to-last line of the paragraph? Hmmm, that could work.
I'll think about that.


Thanks,
Vincent


Re: [GSoC] Quick news

2006-08-22 Thread Vincent Hennebert

Ok, I'll need a couple of additional days to finish this work.

Between a research paper and the actual code there is quite a gap...
I had a hard time trying to find a proper design, dealing with
special cases while factoring out the common code. In particular, deciding
how to handle too-short/too-long nodes and the recovery mechanism wasn't
easy.

I'll be offline tonight and tomorrow, but I hope to have the patch ready
for next Sunday or Monday.

Cheers,
Vincent



Hi all,

The GSoC is over :-(

I wanted to submit a patch containing a full implementation of
before-floats before the end of the GSoC, but it turns out that this
was more difficult than expected. Currently there are plenty of bugs,
not a line of javadoc, no comments, much cleanup to do, etc. But right now
I'm unable to do anything but go to bed. Perhaps more in a few
hours...

Cheers,
Vincent



Re: [GT2006] Registration is open! Cocoon GetTogether 2006 (Oct 2-4,Amsterdam)

2006-08-27 Thread Vincent Hennebert

He he, you can count me in guys. I'm looking forward to meeting you in real life.

I should be there for both days. Are you planning to participate in
the evening events? Or do we organize our own?

Vincent


2006/8/22, Jeremias Maerki:

I for both days. Arriving late Monday morning and leaving again Wednesday
late morning.

On 22.08.2006 20:42:02 Simon Pepping wrote:
 I registered for Monday.

 Regards, Simon

 On Mon, Aug 14, 2006 at 11:20:59AM +0200, Arje Cahn wrote:
  Hi FOP folks,
 
  If you'd like to attend the Cocoon hackathon for FOP-hacking, please use
  the normal registration form on www.cocoongt.org, deselect the Cocoon
  GetTogether checkbox and select both Hackathon boxes. Then, scroll down
  a little and from the "Knowledge level" list, select "Working on a
  different project, here for the hackathon" :).
 
  The hackathon-only fee is 25 euros per day (shouldn't be too bad I hope...!).
 
  Hope to see you guys in October!
 
  Regards,
  Arjé

 --
 Simon Pepping
 home page: http://www.leverkruid.eu



Jeremias Maerki




Re: FOP Poster

2006-08-28 Thread Vincent Hennebert

2006/8/28, Jeremias Maerki:

Gang,

I've finally finished (more or less anyway) the poster I plan to put up
at OpenExpo on 2006-09-20. I'd appreciate if someone could take a quick
peek and tell me if it's looking too ugly or if there are any spelling
mistakes. The logos may seem a bit dark on screen, but they look fine in
print.

http://jeremias-maerki.ch/download/fop/fop-poster.pdf

BTW, the poster is done entirely with FOP and Batik. :-)


Congratulations, very nice work. A few comments and suggestions:
- in section "Output Formats": not sure, but it might be preferable to
 put a space between the dots and the "or" in "...or your format"
- in section "Foreign XML support": missing parenthesis on the first
 item
- you might want to put the very same sentence "...or your own format"
 in both sections ("Output Formats" and "Foreign XML Support"). This
 might be more eye-catching
- it is best judged on the final print output, but perhaps the section
 titles in bold?
- finally, but that's nitpicking: you might want to replace the upright
 apostrophes and quotation marks with their true typographic versions
 (U+2019 for the apostrophe, U+201C, U+201D for the quotation marks).
 For a professional look...


Vincent


Re: [GSoC] Quick news

2006-08-29 Thread Vincent Hennebert

Thanks for your support, guys. I've made some progress since last
week, but there are still some bugs well hidden here and there, and
unexpected behaviors (but also some improvements, phew). For now I'll
take a little break and spend some time with my family. I should get
back to work in one week. I'll remain reachable, though.

Cheers,
Vincent


Re: svn commit: r442282 - in /xmlgraphics/fop/trunk: ./ src/documentation/content/xdocs/trunk/ src/java/org/apache/fop/fo/ src/java/org/apache/fop/fo/extensions/ src/java/org/apache/fop/fo/extensions/

2006-09-12 Thread Vincent Hennebert

2006/9/11, Jeremias Maerki:

As mentioned last month, I just changed the property names a little bit.
If anyone finds these inappropriate, please yell.


This is not a big deal, but the names bother me a bit. Reading
"widow-content-limit" makes me think it is the maximum allowed
amount of widow content, which sounds a bit
strange. What about "min-widow-content"?
But I wouldn't mind if you kept the current names.

Vincent


Re: [Xmlgraphics-fop Wiki] Update of PageLayout/PageAndLineBreaking by SimonPepping

2006-09-12 Thread Vincent Hennebert

Hi Simon,

I finally took the time to read and digest your Wiki page. It makes for
interesting reading. A few comments:



According to that representation paragraphs with inline text have
legal linebreak points. I consider those legal linebreak points also
as legal pagebreak points. In addition, there are legal pagebreak
points between the vertical elements such as paragraphs and blocks.


One issue that will have to be addressed is that widow or
keep-with-previous settings may invalidate some previously believed
legal breakpoints. In such cases, active nodes which contain those
breakpoints in their chains will have to be deactivated; if they were
the chosen best nodes, some other nodes will somehow have to be
retrieved to replace them. I hope this won't be too great a difficulty.



Within the inner loop, we consider the page and paragraph layout
between the active and the current pagebreak point. If the active
breakpoint is within a paragraph, we calculate the best line breaks
from that breakpoint to the end of the paragraph. For all complete


Unless the current breakpoint lies in the same paragraph.



In page independent linebreaking, for each feasible breakpoint the
best node is retained, which represents the best layout of the
paragraph up to that point, and which, due to the dynamic principle,
is part of the best layout of the whole paragraph if that layout uses
this breakpoint. If line numbers matter, the best node for each line
number is retained. In page dependent linebreaking, even that is not
enough. We must retain the best node for each vertical offset on the
page, because that is the quantity that influences page breaking. This


Good point. This led me to the following thoughts:

Currently the iteration over the active nodes is split into two loops:
one iterating over the line numbers, one iterating over the
active nodes associated with each line number. Why? Because if line widths
aren't all the same, they influence the computation of the best line
breaks: when considering a given legal breakpoint, we must know
the width of the line it would end in order to compute the
shrinking/stretching for that line. In fact we distinguish
between line numbers because they determine the /context/ in which
linebreaks are computed. If all the lines had the same width, such a
distinction wouldn't be necessary.

Merging line and page breaking generalizes this problem of
differing contexts. This time, not only the line number counts, but also
the page number, the offset from the beginning of the page, the
out-of-lines to be placed, etc. I think the greatest challenge will be
to identify all the elements which determine that context, and to be
able to compare two contexts and say whether they are equivalent or not.
Considering case 1 as described on the wiki page, there are only two
different contexts: page number even or odd. In this case the offset
from the beginning of the page doesn't count. In other, more complex
documents this may be much more complicated.
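
To make the idea of comparable contexts a bit more concrete, here is a
minimal sketch in Java. It is only an illustration under assumptions: none
of these names (BreakContextSketch, Context, offer, contextsAt) exist in
Fop, and the three fields are just the ingredients listed above, not a
complete list. The point is that two contexts are "equivalent" exactly when
equals() says so, and one best node is retained per breakpoint and context.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch, not Fop code: keeping the best node per breaking context. */
final class BreakContextSketch {

    /** Everything assumed to influence how the following lines/pages are laid out. */
    static final class Context {
        final boolean oddPage;    // page parity (line widths may differ)
        final int offsetInPage;   // distance from the top of the page, in millipoints
        final int pendingFloats;  // out-of-line objects still waiting to be placed

        Context(boolean oddPage, int offsetInPage, int pendingFloats) {
            this.oddPage = oddPage;
            this.offsetInPage = offsetInPage;
            this.pendingFloats = pendingFloats;
        }

        /** Case 1 of the wiki page: only the page parity matters. */
        static Context parityOnly(boolean oddPage) {
            return new Context(oddPage, 0, 0);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Context)) {
                return false;
            }
            Context c = (Context) o;
            return oddPage == c.oddPage
                    && offsetInPage == c.offsetInPage
                    && pendingFloats == c.pendingFloats;
        }

        @Override
        public int hashCode() {
            return Objects.hash(oddPage, offsetInPage, pendingFloats);
        }
    }

    /** Lowest demerits seen so far, per breakpoint index and per context. */
    private final Map<Integer, Map<Context, Double>> best = new HashMap<>();

    /** Records a candidate; returns true if it is the best so far for its context. */
    boolean offer(int breakpoint, Context ctx, double demerits) {
        Map<Context, Double> perContext =
                best.computeIfAbsent(breakpoint, k -> new HashMap<>());
        Double current = perContext.get(ctx);
        if (current == null || demerits < current) {
            perContext.put(ctx, demerits);
            return true;
        }
        return false;
    }

    /** Number of distinct contexts retained at the given breakpoint. */
    int contextsAt(int breakpoint) {
        Map<Context, Double> perContext = best.get(breakpoint);
        return perContext == null ? 0 : perContext.size();
    }

    public static void main(String[] args) {
        BreakContextSketch table = new BreakContextSketch();
        table.offer(10, Context.parityOnly(true), 120.0);
        table.offer(10, Context.parityOnly(true), 150.0);   // same context, worse: dropped
        table.offer(10, Context.parityOnly(false), 150.0);  // other context: kept
        System.out.println(table.contextsAt(10));
    }
}
```

With parity-only contexts the table never holds more than two entries per
breakpoint; adding the offset or the pending out-of-lines to the key is what
would make the number of retained nodes grow in more complex documents.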



When the linewidths depend on the page number, we need to remember the
best pagebreak node for each feasible pagebreak point for each page
number. Otherwise, we only need to remember the best pagebreak node
for each feasible pagebreak point. Note that the latter condition is
true in the presence of out-of-line elements, because those are
related to the content of the page, not the the page number.


Small typo: "not the the page number"



Optimization opportunity 1: We may need to reuse many times the best
layout from a breakpoint to the end of the paragraph, and the best
layout from the start of the paragraph to a breakpoint. Therefore we
need to store a reference to the best end node for either case in the
active node. If we wish to take into account the different possible
heights of the part of the paragraph, we need to store references to a
set of best end nodes. Especially for long page sequences, a page
breakpoint may be feasible both for an odd and an even page. In that
case, we need to store different end points for each, due to the
different line widths on odd and even pages. This optimization is
certainly true for the start of the paragraph. There will be many page
layouts on which the whole paragraph fits.


So, this might be handled automatically by the dynamic algorithm if we
were able to identify the different contexts.



Optimization opportunity 2: Do we need to consider each active node?
Or can we already determine that some active nodes will never give a
better layout than others? Suppose that a paragraph has two feasible
breakpoints A and B, which have an equal number of lines, or even the
same height, before and after the page breakpoint. Suppose that B has
a higher amount of demerits than A. Can we then conclude that B will
never be part of the best layout, because a better layout can always
be achieved with A? Yes, we can.


Same here, we would just have to detect that linebreak 

Re: svn commit: r446682 - in /xmlgraphics/fop/trunk/src/documentation/content/xdocs: 0.92/upgrading.xml trunk/upgrading.xml

2006-09-15 Thread Vincent Hennebert

Author: jeremias
Date: Fri Sep 15 11:53:15 2006
New Revision: 446682

URL: http://svn.apache.org/viewvc?view=rev&rev=446682
Log:
mention that the config file format has changed.


<snip/>

+  If you are using a configuration file, you have rebuild it in the 
new format. The format


A small typo: you have /to/ rebuild it...

<snip/>

Vincent


Re: FOP embed truetype font into postscript file [was in fop-users]

2006-10-05 Thread Vincent Hennebert
(Switching to the fop-dev list, as this discussion is becoming more and
more code-related. I suggest you subscribe to this list if you
haven't already.)

Nguyen, Thang a écrit :
 Now I'm drowning in the PostScript specification :), could you tell me how or where I can
 find documents on the way FOP uses the font metric file & embedded font with PDF
 output, and what's the current way FOP deals with the font metric file &
 PostScript output? I hope that someone can help me; it would be a lot
 easier than looking at the source code.

There is no particular documentation about how fonts are handled by Fop,
apart from the page explaining how to configure custom fonts [1]. So
you'll have to look at the source code. Unfortunately for you, this area of
the code is undergoing some heavy refactoring, as the current font
library of Fop will soon be replaced with another one. That said, it
doesn't prevent you from starting to study how the PS renderer should be
extended to support TrueType fonts.

For any font-related issue the PS renderer will rely on the aXSL API
[2]. So what would need to be done is to check that this API provides
all the information necessary for embedding a TrueType font in the
rendered PS file. If so, then this is only a matter of writing the
necessary aXSL method calls within the PS renderer and putting the
needed PostScript glue around the font information.

If the API doesn't provide the necessary information, then it will have
to be extended, but we'll see about that in a second step.

So I suggest you start by having a look at the aXSL API, the code of
the PS renderer and the specifications of TrueType and PostScript. If
you have further questions after that, just ask.

Vincent


[1] http://xmlgraphics.apache.org/fop/trunk/fonts.html
[2] http://www.axsl.org/font-r/

 
 thanks,
 
 Thang.
 
 -Original Message-
 From: Vincent Hennebert [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, October 05, 2006 5:06 PM
 To: fop-users@xmlgraphics.apache.org
 Subject: Re: FOP embed truetype font into postscript file
 
 Nguyen, Thang a écrit :
 Seems that a lot of work lies ahead :-) and do you know anything about 
 GhostScript http://www.ghostscript.com/awki , I just looked at it 
 yesterday and I hope that I can find something useful.
 
 Ghostscript is a tool that can convert PostScript or PDF files into image
 formats (PNG, JPEG), render them on screen, print them on non-PostScript
 printers, re-work them (extract pages, n-up printing...), etc. While this is
 a very useful tool, it won't help you with that task, except for
 viewing generated PS files. It is perhaps capable of extracting useful
 information from PostScript files, but I'm not sure.
 
 HTH,
 Vincent
 
 
 -Original Message-
 From: Jeremias Maerki [mailto:[EMAIL PROTECTED]
 Sent: Thursday, October 05, 2006 4:33 AM
 To: fop-users@xmlgraphics.apache.org
 Subject: Re: FOP embed truetype font into postscript file

 I don't have much time right now, so I can only give you the most 
 important stuff. I can give you more on the weekend, if you need more.

 The most important part is having the PostScript and TrueType 
 specifications around:

 PS spec:
 http://www.adobe.com/products/postscript/pdfs/PLRM.pdf

 Other PS-related docs:
 http://partners.adobe.com/public/developer/ps/index_specs.html

 OpenType 1.4 spec (also applies to TrueType):
 http://partners.adobe.com/public/developer/opentype/index_spec.html
 http://www.microsoft.com/typography/otspec/default.htm

 Other font-related docs:
 http://partners.adobe.com/public/developer/opentype/index.html

 I'd search for "Type 42 font" in the PS language reference to start 
 with.
 However, if you want to have all glyphs of a TrueType font available 
 you have to look into the CID keyed fonts direction. I don't know 
 anything about how CID handling is done in PostScript, only for PDF, 
 so for this part you're pretty much on your own for now. It's probably 
 easiest, if you try to find/produce a PostScript file from a different 
 application that can already generate the right code for handling 
 TrueType fonts so you get an idea how this is done. You might also 
 find some helpful information on the web.

 Another important hint is that some of the PostScript code is not in 
 FOP itself but in XML Graphics Commons:
 http://svn.apache.org/repos/asf/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/ps/

 Most font-specific code, however, is still in FOP because we haven't 
 factored out the font library, yet. So the rest of the code will be
 here:
 http://svn.apache.org/repos/asf/xmlgraphics/fop/trunk/src/java/org/apache/fop/render/ps/

 Good luck!

 On 04.10.2006 02:48:52 Nguyen, Thang wrote:
 That's great. Could you point me to where to start?

 Thang.

 -Original Message-
 From: Jeremias Maerki [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, October 03, 2006 2:49 PM
 To: fop-users@xmlgraphics.apache.org
 Subject: Re: FOP embed truetype font into postscript file

Re: XSL-FO 2.0 workshop in Heidelberg next week

2006-10-10 Thread Vincent Hennebert
Jeremias Maerki a écrit :
 If anyone has any requirements for XSL-FO 2.0 which I should bring up at
 the workshop in Heidelberg next week, please let me know. Deadline
 2006-10-16 so I have time to prepare.

Jörg's comments just reminded me of something I think is missing in the
current spec:
Enable the compact box scheme specified in CSS2: if an inline box is
short enough to fit in the margin of the following block box, it is put
in the margin; otherwise, it is transformed into a block box to be put
before the following block box.

That makes it possible to mimic the DT/DD items of HTML:

term    the definition of the term term. The definition of the term
        term. The definition of the term term. The definition of the
        term term. The definition of the term term.

Another term too long to fit in the margin
        the definition of the too long term. The definition of the
        too long term. The definition of the too long term. The
        definition of the too long term.


Unless I'm wrong, this is not currently possible in
XSL-FO.
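
The placement rule itself is tiny. As a sketch only (the names below are
hypothetical and belong to neither CSS2 nor any XSL-FO implementation, and
widths are assumed to be in one common unit such as millipoints), the
decision described above could look like this in Java:

```java
/** Hypothetical sketch of the CSS2 "compact" placement decision described above. */
final class CompactBoxSketch {

    enum Placement { IN_MARGIN, AS_BLOCK }

    /**
     * @param termWidth   width of the compact box once laid out on a single line
     * @param marginWidth width of the start margin of the following block
     */
    static Placement place(int termWidth, int marginWidth) {
        // Short enough: the term sits in the margin, next to the first line
        // of the following block. Otherwise it becomes a block of its own.
        return termWidth <= marginWidth ? Placement.IN_MARGIN : Placement.AS_BLOCK;
    }

    public static void main(String[] args) {
        System.out.println(place(30000, 50000));
        System.out.println(place(180000, 50000));
    }
}
```

The hard part for a formatter is not this comparison but that the width of
the term is only known after it has been laid out, so the choice between the
two layouts has to be made during layout rather than in the stylesheet.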

Thanks,
Vincent


Re: svn commit: r462814 - /xmlgraphics/fop/trunk/src/java/org/apache/fop/pdf/PDFToUnicodeCMap.java

2006-10-11 Thread Vincent Hennebert
Grmblbmlbbllbll. Forgot to change my Eclipse settings.
Sorry, won't happen again :-\

Vincent

 Author: jeremias
 Date: Wed Oct 11 07:30:34 2006
 New Revision: 462814
 
 URL: http://svn.apache.org/viewvc?view=rev&rev=462814
 Log:
 Tabs again. :-)



Re: [VOTE:RESULT] Vincent Hennebert as new committer

2006-10-16 Thread Vincent Hennebert
Many thanks to all of you for your support. I'm glad to join the XML 
Graphics team. I think this is a team of really nice people, it has 
always been great to work with you as a contributor and it will be a 
pleasure to work now as a committer.


At the GetTogether I felt what belonging to the Apache community really 
means. To me this is one of the greatest aspects of open-source 
development, and I'm proud to now be a member of such a community.


Thanks again, I'll do my best to deserve my new status!

Vincent


Jeremias Maerki a écrit :

Ok, high time to wrap this up.

We have:
10 +1
1 +0
no other votes
(6 of 8 PMC members have voted)

Vincent is now an ASF committer with write access for FOP and XML
Graphics Commons.

Congratulations, Vincent, and welcome! I'll follow up with
the administrative stuff in a second.

On 11.10.2006 10:48:29 Jeremias Maerki wrote:

Simon and I were able to meet Vincent Hennebert in person in Amsterdam
last week. He has already made an impression with his excellent work for
the GSoC. He's a very nice and intelligent guy, eager to learn and to
work on FOP. Another guy who's not afraid to jump into the depths of the
layout engine.

Since the GSoC he has found a job at Anyware
(http://www.anyware-tech.com), a French company which does a lot of
Cocoon stuff. There, he has the opportunity to work around 50% on FOP
sponsored by the same customer that enabled me to invest so much time
into FOP over the last two years.

I'd like to propose Vincent as an XML Graphics committer (FOP working
set). I believe he will be a good addition to our work force.




Jeremias Maerki





FOrayFont integration in question

2006-11-13 Thread Vincent Hennebert
Hi all,

Sorry for the long post, but I think this is an important one.

I would like to hear your views on the FOrayFont integration.
Since I started to work on that (in July 2005), things have evolved
quite a bit and I'm starting to doubt that integrating FOrayFont really is a
good thing for Fop.

I've already discussed this whole issue with some of you, but I
think it might be worth summarizing the points, and making everyone
aware of it. Because I have the feeling that whatever decision we make,
this will be a difficult one.

First, some progress information about the integration: the PDF
renderer now works with FOrayFont, and seems to run well. The other
renderers are still to be adapted. There shouldn't be too much work for
the Java2D-based ones, a bit more for the PostScript renderer, and also
for PCL and AFP (I can't evaluate how much there is to do for those
as I know nothing about those formats).
I estimate about 5 days of work to get something that compiles.
There should be no loss of features; there is a known problem with
the PostScript renderer (no way to know which fonts are used for a given
page, so we have to embed all of the configured fonts in the header),
but Jeremias is working on a two-pass system thanks to which this
problem should be solved soon.

For those who are not familiar with the FOrayFont architecture, here's a
quick presentation: there is a separate project called aXSL [1] (also
maintained by Victor) whose purpose is to define a standard API for
several modules related to XSL-FO. The one we're interested in is
aXSL-font, but there are also modules for dealing with graphics,
manipulating the FO tree, the area tree, etc.
The goal is to have standard interfaces shared by XSL-FO
implementations, provided, of course, that more than one
implementation actually implements aXSL.
So FOrayFont is a particular implementation of aXSL-font. If Fop were
using FOrayFont there would actually be almost only aXSL calls in the
code.

[1] http://www.axsl.org/

Now, let me enumerate the pros and cons of the adaptation of FOrayFont
to Fop:

Cons:
- After Bertrand's recent work on OTF support the existing font library
  is not far from being as feature-complete as FOrayFont:
  - ToUnicode support is now available;
  - it seems easy to remove the XML metrics generation step (actually
    Jeremias told me he had already done it on his working copy)
- the old font support would have to be kept for use by Batik (PS & PDF
  transcoders) as the Batik people have strong feelings against external
  dependencies
- FOrayFont introduces a new font-config file which would disturb users
  (although I think it is better and more flexible than the current one)
- FOrayFont is mainly a one-man show and it's not very good for Fop to
  have such a dependency. And as this is primarily Victor's baby we
  can't just come in and ask for write access to the code or whatever.
  We must first show that our point of view is compatible with Victor's.
- However, it seems like we have difficulties understanding each other:
  each time I propose a change on the dev list, that triggers a lengthy
  discussion where we both try to explain our own point of view and
  understand the other one, without even succeeding in the end, I think.
  There is some cultural gap plus a foreign-language issue that hinders
  communication.
- As a consequence, proposing changes on the aXSL/FOray side to better
  suit our needs will require twice as much time and energy as making
  them on our own side.
- And given that the API isn't perfect yet, I'm a bit afraid of going
  that route.
  One major missing feature, for example, is the ability to cache
  information about fonts and retrieve it later; this is necessary
  for the XML area tree output or the CachedRenderPagesModel. There is
  simply no means in the API to get a font's identifier, in order to
  retrieve it later without having to re-launch the whole resolution
  process.
- during the past year, growing technical disagreements have appeared;
  if we keep working together we might end up with something that
  satisfies neither of us, because of the many compromises we would
  have to make. That ranges from programming practices to API design
  decisions.
- As far as I know, FOray has never been used in production yet, and it
  may be unstable. There are currently not many testcases and, well,
  it's already not much fun to write testcases for one's own code, so if
  I have to write testcases for others...

Now for the pros:
- It would be unfortunate to break the last bridge between Victor and
  Fop.
- I've myself already done quite a bit of work on FOrayFont, which would
  be basically lost.
- Despite existing problems Victor brought quite a number of
  improvements to the font library, which would have to be re-done. And
  he started from the 0.20.5 code, like we would if we were to go our
  own way (tell me if I'm wrong, but I don't think the font code changed
 

Re: svn commit: r474387 - in /xmlgraphics/fop/trunk: ./ src/java/org/apache/fop/fonts/ src/java/org/apache/fop/fonts/truetype/ src/java/org/apache/fop/fonts/type1/

2006-11-14 Thread Vincent Hennebert
Jeremias Maerki a écrit :
 Another of my travel projects checked in: I wanted to know how easy it
 is to load fonts without the XML metrics file. As you can see from the
 amount of code, it was rather easy. Makes me wonder why we didn't do it
 earlier. :-)

Nice work, Jeremias. Well, definitely an additional point for FOPFont.


 Please note that the existing functionality is still fully there. The
 only change is that you can simply now omit the metrics-url attribute in
 the font configuration and the font will still load. What you lose in
 this case is the ability to manually tweak the XML metrics file or to
 specify WinAnsi encoding for TrueType fonts. Furthermore, you currently
 don't have control over TrueType Collections. The WinAnsi feature should
 not be necessary anymore now that we have ToUnicode CMaps. The
 Collection feature is easily added again through an additional attribute
 in the font configuration.

... and I don't see why one would need to tweak the metrics of a font?


 I'd be grateful if anyone could test what happens if you load a Type 1
 font on a Unix where the PFM file has an uppercase extension, e.g.
 FUTURA.PFM. I have to construct the URI to the PFM manually from the
 PFB and use a lowercase extension (.pfm). Maybe we have to improve
 that to check with upper- and lowercase to account for case sensitive
 file systems.

This fails. IMO the safest solution is to require that the name of the
pfm file be specified in the config file. All the more so since it should
also be possible to give an afm file instead of a pfm one (BTW, Fop can't
read afm files, can it? Because FOrayFont can...).
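
To illustrate the two options discussed here, a small Java sketch follows.
It is an assumption-laden illustration, not Fop code: the class and method
names are made up, and so is the idea of passing the configured metrics
name as a plain string. An explicitly configured name always wins; only
when none is given are the lowercase and uppercase guesses derived from the
PFB name, to be tried in that order by the caller.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not Fop code: choosing the metrics file for a Type 1 font. */
final class Type1MetricsSketch {

    /**
     * Returns the metrics locations to try, in order.
     * A configured name is used as-is; otherwise names are guessed from the PFB one.
     */
    static List<String> candidates(String pfbUri, String configuredPfm) {
        List<String> result = new ArrayList<>();
        if (configuredPfm != null && !configuredPfm.isEmpty()) {
            result.add(configuredPfm);
            return result;
        }
        int dot = pfbUri.lastIndexOf('.');
        int slash = pfbUri.lastIndexOf('/');
        // Strip the extension only if the dot belongs to the file name,
        // not to one of the directories in the path.
        String base = dot > slash ? pfbUri.substring(0, dot) : pfbUri;
        result.add(base + ".pfm");
        result.add(base + ".PFM");
        return result;
    }

    public static void main(String[] args) {
        System.out.println(candidates("fonts/FUTURA.PFB", null));
        System.out.println(candidates("fonts/FUTURA.PFB", "fonts/FUTURA.PFM"));
    }
}
```

Guessing only covers all-lowercase and all-uppercase extensions; a file
named Futura.Pfm would still be missed, which is one more argument for the
explicit configuration entry.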

If this is ok for you I'll implement these changes. Or perhaps I should
wait until a final decision about FOrayFont is made.

Vincent


Status of the collapsing border model

2006-11-15 Thread Vincent Hennebert
Hi all,

Just to let you know that I'd like to finish the implementation of the
collapsing border model.
I've started to look at the wiki pages, the code and the mail archives
but if you have any hints about which problems remain to be solved,
where to look in particular, etc., I'm all ears ;-)

Thanks,
Vincent


Questions regarding the table layout code

2006-11-24 Thread Vincent Hennebert

Hi guys,

As you may have noticed I have started to work on the table layout code.
For now I have just made some small improvements, mainly added javadoc
comments and renamed variables to names I believe are more explicit.
Please tell me if there are changes or javadoc comments you don't agree
with.

Now I have a first batch of questions for those who are familiar with
that part of the code. Here they are:
- in TableContentLayoutManager.getKnuthElementsForRowIterator: when a
  new row-group is fetched, its possible break-before seems to be taken
  into account only if the current Knuth element list ends with a
  penalty item. I suspect this is a bug, but would like to have
  confirmation.

- in TableRowIterator.java:
  - when the end of a table-part (table-header, -footer, -body) is
    reached and there are pending spans, a new EffRow is created to
    contain the remaining spans. Is that really desirable? When there
    are no explicit table-row elements in the table I agree with that
    behavior. But when table-rows are explicit and a cell in the last
    row must span over several rows, I would bet this is an error in the
    input FO file (the 1.1 recommendation states that spans over several
    table parts are forbidden). Wouldn't it be better to raise an error
    in such a case?
  - if there are several table-bodies, LAST_IN_PART will be set only on
    the last row of the last table-body. Is that behavior really
    intended? I would say no as AFAIU this flag is used for border
    resolution.

- IIUC, a PrimaryGridUnit is meant to be the before-start (top-left)
  grid unit of a spanned cell, while GridUnit is for the other grid
  units (and thus only appears in spanned cells). Is that a design
  choice, or just a side-effect? I'd like to add an explanation of that
  in the javadoc of PrimaryGridUnit.

- I seem to have seen somewhere that normalizing tables when building
  the FO tree would ease life; that is, table-row FO objects would be
  created for tables which don't contain any (and rely on the
  starts-row/ends-row properties). Apparently that's not done in the
  current code. Just to know, is it due to a lack of time, or a design
  decision? With my current understanding of the whole issue it seems to
  me that that would indeed ease things at the layout stage, but I may
  have missed something.

- it seems that the getNextKnuthElements method is meant to return each
  time a forced break is encountered. Is that a design requirement? Can
  I add that point to the method's javadoc?


That's it for now. But get ready for a second whole bunch later ;-)
Thanks,
Vincent


Re: Problems with display-align

2006-11-30 Thread Vincent Hennebert
Chris Bowditch a écrit :
 Vincent Hennebert wrote:
 
 Hi Bradley,

 Bradley Harrington a écrit :

 Hello,
 I don't know if this is a known problem or not, however the
 display-align attribute always puts the text at the top of the cell for
 me. I have tried this with both trunk and 0.92 beta. It does work with
 other XSL parsers I've tried. Here is the sample code:


 It's not implemented. And because of the Knuth approach it doesn't seem
 entirely trivial to implement it. Good news is that I'm currently
 working on tables, so I may find a way to do it. We'll have to
 eventually, anyway. Stay tuned.
 
 Strange. I thought this worked. I just ran a simple test and it appears
 to work for me. PDF attached.

Hmmm, by looking at the code I had the feeling it wasn't implemented,
but I may well be wrong. I'll have another look. More later.

Vincent


Re: FOrayFont integration in question

2006-12-01 Thread Vincent Hennebert
Hi All,

So, not many opinions on this it seems. Thanks to Bertrand and Jeremias
for their comments.
I'll need to have a closer look at the current font library. As I was
supposed to replace it with FOrayFont I have never studied it in detail
yet. Then I'll see if it is best to keep it or to switch to a fork of
FOrayFont. Although right now I've the feeling the former solution is
preferable.
My first two goals are to polish the removal of the XML metrics
generation step (mainly add an optional parameter in the config file for
specifying the name of the PFM file), and add support for AFM metrics files.

Then... we'll see.

Cheers,
Vincent



Re: DO NOT REPLY [Bug 41019] - Left-align oddness with long, unbreakable strings following

2006-12-06 Thread Vincent Hennebert
Hi Luca,

Luca Furini a écrit :
<snip/>
 1) TextLM breaks the text even when a "/" or a "-" is found, handling
 them as hyphenation points with the usual sequence of glue + penalty +
 glue elements.
 
 The LineLM tries, in the first instance, to avoid using hyphenation
 points, so the penalty is not taken into account. But this has the side
 effect of using the first glue element as a feasible break (if the
 penalty were a feasible break too, it would surely be a better one, thus
 preventing the glue from actually being chosen).

I don't follow you: IIUC the glue-penalty-glue triplet is generated only
the second time, when the first breaking doesn't give acceptable
results? What do you mean by "the penalty is not taken into account"?

Also, I don't see why the penalty would be preferred over the glue, as
it has a positive penalty value.


 This is probably the smaller of the problems, and can be solved by just
 adding an infinite penalty before the first glue element. But maybe we

This seems to be a good idea, anyway.


 want to prevent this breaking to happen, as we can now use
 zero-width-spaces to explicitly insert breaking positions?

Good point. I'd say yes for '/'. This would add a burden to the user who
would have to modify the FO generation step to add ZWSP for URLs or
filenames; but we must also take into account cases where the user does
/not/ want the word to be split at '/' characters.
For hyphens, I would keep the current behavior, as this is the most
expected one IMO. And it can also be prevented by adding a
zero-width non-breaking space.
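
For readers less familiar with the box/glue/penalty model, the effect of
the infinite penalty proposed under point 1 can be shown with a few lines
of Java. This is only a sketch under assumptions: the class is made up, the
penalty value 50 and the sentinel INFINITE = 1000 are arbitrary, and the
only rule modelled is the classic one that a glue is a legal break when it
directly follows a box, while a penalty is a legal break unless its value
is infinite.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch, not Fop code: legal breaks in a box/glue/penalty sequence. */
final class LegalBreakSketch {
    static final int INFINITE = 1000; // sentinel chosen for this sketch

    enum Kind { BOX, GLUE, PENALTY }

    static final class Element {
        final Kind kind;
        final int penalty; // only meaningful for PENALTY elements

        Element(Kind kind, int penalty) {
            this.kind = kind;
            this.penalty = penalty;
        }
    }

    static Element box() { return new Element(Kind.BOX, 0); }
    static Element glue() { return new Element(Kind.GLUE, 0); }
    static Element penalty(int value) { return new Element(Kind.PENALTY, value); }

    /** Returns the indices at which the sequence may legally be broken. */
    static List<Integer> legalBreaks(List<Element> seq) {
        List<Integer> breaks = new ArrayList<>();
        for (int i = 0; i < seq.size(); i++) {
            Element e = seq.get(i);
            if (e.kind == Kind.PENALTY && e.penalty < INFINITE) {
                breaks.add(i);
            } else if (e.kind == Kind.GLUE && i > 0 && seq.get(i - 1).kind == Kind.BOX) {
                breaks.add(i);
            }
        }
        return breaks;
    }

    public static void main(String[] args) {
        // word, then glue + penalty + glue for the "/" position, then the next word
        List<Element> current = Arrays.asList(
                box(), glue(), penalty(50), glue(), box());
        // same, with an infinite penalty in front of the first glue
        List<Element> guarded = Arrays.asList(
                box(), penalty(INFINITE), glue(), penalty(50), glue(), box());
        System.out.println(legalBreaks(current));
        System.out.println(legalBreaks(guarded));
    }
}
```

In the first sequence both the first glue (index 1) and the penalty
(index 2) are legal breaks; in the guarded one only the penalty (index 3)
remains, because the first glue no longer follows a box.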


 2) The presence of an inline object larger than the available width
 makes the algorithm deactivate all the active nodes and then restart
 with a second-hand node, as no line can be built that does not
 overflow. The restarting node was chosen, in
 BreakingAlgorithm.findBreakingPoints(), between lastTooShort and
 lastTooLong, neither of them being a good breaking point. There is a
 lastDeactivated node chosen among the deactivated nodes but it was not
 used.
 
 A deactivated node previously was an active one, so it is surely better
 than a node who failed to qualify; replacing either lastTooShort or
 lastTooLong (according to the adjustment) with lastDeactivated leads to
 a better set of breaks. However, this is not enough. The attached file
 small.20.pdf shows the result after fixing these first two problems.
 
 
 3) At the moment, the LineLM can call findBreakingPoints() up to three
 times, the last one with a maximum adjusting ratio equal to 20. I came
 to the conclusion that this is really TOO much. I tried stopping after
 the second call (with max ratio = 5) and the result is much better (see
 attached file small.5.pdf).

Yes, 20 is probably too much. Perhaps we also need to distinguish the
case where no acceptable line-breaking can be found because a box is too
long to even fit alone on one line. In such a case even a very high max
ratio won't help.


 A high maximum adjustment ratio means that the algorithm is allowed to
 stretch spaces a lot in order to find a set of breaks which is
 *globally* better; this means that it can choose some not-so-beautiful
 breaks in order to build a set spanning over a larger portion of the
 paragraph.
 
 In our example: there can be a break just before the long url (a line
 ending after "Consider:") only if we use an enormous adjustment ratio.
 With a smaller, more appropriate threshold, "Consider:" can no longer end
 a line, so the algorithm will restart from a previous point.
 
 
 In conclusion: the first two items are easily fixed, and I'm going to
 commit the changes in the afternoon (if there are no objections);
 concerning the question of the automatic break at /- characters, I'll
 probably leave the code unchanged for the moment, until we decide what is
 best.
 
 Concerning point #3, I'm going to have a closer look at the restarting
 mechanism ...

Yes, the current mechanism doesn't seem to be good enough, but I'm
wondering if we can find a better one. Currently a too-short/too-long
node replaces another one if it has fewer demerits. The number of
lines/pages handled so far isn't taken into account. So it is likely
that a too-short/too-long node ending an earlier line/page will be
preferred over a node going further in the Knuth sequence. Why should
that be the case?
In fact the main problem, I think, is to find the right heuristic to
select too-short/too-long nodes, in order to end up with the most
acceptable result. Easy to say...
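
To show the difference between the two selection rules, here is a small
Java sketch. It is not the BreakingAlgorithm code: the classes, fields and
comparators are hypothetical, and "furthest first" is just one candidate
heuristic among many, not a recommendation. BY_DEMERITS models the rule
described above (fewer demerits wins); FURTHEST_FIRST prefers the node that
goes further in the Knuth sequence and only uses demerits to break ties.

```java
import java.util.Comparator;

/** Hypothetical sketch, not Fop code: two ways of ranking fallback nodes. */
final class FallbackChoiceSketch {

    /** A too-short or too-long node kept for restarting the algorithm. */
    static final class Node {
        final int position;     // index of the break in the Knuth sequence
        final double demerits;  // total demerits accumulated up to this node

        Node(int position, double demerits) {
            this.position = position;
            this.demerits = demerits;
        }
    }

    /** The rule described above: the node with fewer demerits wins. */
    static final Comparator<Node> BY_DEMERITS =
            Comparator.comparingDouble((Node n) -> n.demerits);

    /** Alternative: the node going further wins; demerits only break ties. */
    static final Comparator<Node> FURTHEST_FIRST =
            Comparator.comparingInt((Node n) -> n.position)
                    .reversed()
                    .thenComparingDouble(n -> n.demerits);

    /** Returns whichever node the given ordering ranks first. */
    static Node better(Node a, Node b, Comparator<Node> order) {
        return order.compare(a, b) <= 0 ? a : b;
    }

    public static void main(String[] args) {
        Node early = new Node(12, 300.0);  // ends an early line, few demerits
        Node late = new Node(40, 900.0);   // goes further, more demerits
        System.out.println(better(early, late, BY_DEMERITS).position);
        System.out.println(better(early, late, FURTHEST_FIRST).position);
    }
}
```

Neither ordering is obviously right: the first tends to restart too early,
the second may accept a very ugly line just because it covers more content.
A usable heuristic would probably have to weigh both quantities.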

Also, may I suggest you look at the Temp_Floats branch, and perhaps
even work on it instead of trunk? I've made quite heavy changes to
the breaking code that might be difficult to merge back into the trunk
if there are also changes there.


Cheers,
Vincent


Re: DO NOT REPLY [Bug 41019] - Left-align oddness with long, unbreakable strings following

2006-12-06 Thread Vincent Hennebert
Hi guys,

J.Pietschmann a écrit :
 Simon Pepping wrote:
 Would this be a good moment to make these features of the breaking
 algorithm user configurable, like they are in TeX? This allows people
 to play with the various possibilities without having to modify the
 code.

This can be combined with parameters for configuring the handling of
before-floats. We might want to have a coherent set of parameters here.
I was thinking about creating extension parameters in the fox:
namespace, since IMO those are things that have to be set independently
for each FO file rather than in the Fop config file. I'll
try to work on that soon.


 Probably, if this can be combined with implementing UAX14.

This may be the time to look at Simon's generalized Knuth elements for
linebreaking. I wanted to but haven't had the time yet, and I'm still
missing some knowledge regarding UAX14. Damn, so many things to do in so
little time. Not speaking of releasing 0.93...


Vincent

