On Sep 17, 2008, at 13:26, Vincent Hennebert wrote:

Hi Vincent

> Andreas Delmelle wrote:
>> Not necessarily. Maybe the author was just trying to be creative
>> (avoiding a trip through SVG, by using tables as a medium-resolution
>> grid; something like that?)
>
> I know users can be very creative when it comes to using tables. Still.
> Assuming there’s a use case for the example you describe, this will
> probably be to achieve a fancy layout in a magazine-like document. Not
> a 10-thousand-page document, then, and the overhead won’t be noticeable.
>
> Also, let’s not forget that we’re dealing with a corner case here. In
> most cases empty table-cell elements /will/ be added by the XSLT
> stylesheet that produces the FO document anyway. The fact that this
> issue occurred only recently is a confirmation of that. That is why the
> mentioned overhead isn’t really an issue, and why specific code should
> be kept to a minimum.

It's still /unnecessary/ overhead, if you ask me...
I know I've been a big fan of normalizing the FO tree in general, but still: if the cell wasn't there, then forcibly generating it (at parse time) is not really worth the trouble. It will likely cost more than it saves.
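
Just to make the trade-off concrete, here is a rough sketch of the two options. All names are made up for illustration; this is not our actual table code:

// Hypothetical sketch; CellNode stands in for a real (much heavier) FO node.
import java.util.ArrayList;
import java.util.List;

class MissingCellDemo {

    static class CellNode {
        final boolean empty;
        CellNode(boolean empty) { this.empty = empty; }
    }

    /** Option A: normalize at parse time -- every missing cell becomes a
     *  real node that lives as long as the rest of the FO (sub)tree. */
    static List<CellNode> parseRow(List<CellNode> cells, int columns) {
        List<CellNode> row = new ArrayList<>(cells);
        while (row.size() < columns) {
            row.add(new CellNode(true)); // one extra long-lived object per gap
        }
        return row;
    }

    /** Option B: leave the FO tree alone and only account for the gap when
     *  painting the row -- an empty area needs no backing node at all. */
    static void paintRow(List<CellNode> cells, int columns) {
        for (int col = 0; col < columns; col++) {
            if (col < cells.size() && !cells.get(col).empty) {
                System.out.println("col " + col + ": paint cell area");
            } else {
                System.out.println("col " + col + ": empty grid position");
            }
        }
    }

    public static void main(String[] args) {
        List<CellNode> row = new ArrayList<>();
        row.add(new CellNode(false));     // a row with 1 real cell out of 3
        paintRow(row, 3);                 // option B: no dummy nodes created
        System.out.println("option A would keep "
                + parseRow(row, 3).size() + " nodes alive for this row");
    }
}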

<snip />
> Now /that/ is maybe the place for optimizations: improve the FO tree.
> For example, make sure that no object is created if a property has its
> default value;

Well, you'll be pleased to know that is *exactly* what happens now. 8-)

Property instances for default values are created only once (in FOPropertyMapping), and re-used on every occasion (in a multi-session/servlet context: created once at application startup, and re-used as long as the JVM does not terminate).

My point is not so much the objects for the properties. It is the reference to them that is reserved in the containing FONode. /That/ is what makes the instance size of a Block rather large ATM, since it has quite a few of those. The space taken up by the instances that those references point to has already been optimized a great deal.
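
Roughly, the sharing works like this (a simplified, made-up sketch of the idea; the names below are not the actual FOPropertyMapping code):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Simplified illustration of sharing default property values. */
final class DefaultPropertyCache {

    /** An immutable property value: safe to share between all FO nodes. */
    static final class Property {
        private final String name;
        private final String value;
        Property(String name, String value) {
            this.name = name;
            this.value = value;
        }
        @Override public String toString() { return name + "=" + value; }
    }

    /** One shared instance per property, created once per JVM. */
    private static final Map<String, Property> DEFAULTS =
            new ConcurrentHashMap<>();

    static Property getDefault(String name, String initialValue) {
        // Created on first request only; every later FO node gets the
        // same shared reference, no matter how many nodes there are.
        return DEFAULTS.computeIfAbsent(name,
                n -> new Property(n, initialValue));
    }

    public static void main(String[] args) {
        Property a = getDefault("space-before", "0pt");
        Property b = getDefault("space-before", "0pt");
        System.out.println(a == b); // true: one object, many references
    }
}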

> or use a common read-only object for the default value.
> The whole FO tree would benefit from that kind of optimization. But I’m
> not a specialist of that area...

If I'm correct, very little more can be done to reduce the memory usage of the FO tree /itself/. The focus should begin to move, and it is. As long as there is no way for those FONodes to be released before much later, when their areas have been added, the impact of further reductions there is going to be marginal: a hardly noticeable improvement for smaller documents, and maybe room for one more page in very large documents. Every extra FONode that can be added due to more savings in the FO tree costs almost twice as much further on (one extra LM, plus other layout-related objects)...
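
Back-of-envelope, with every number below being an assumption rather than a measurement:

/** Back-of-envelope arithmetic only; every number here is an assumption,
 *  not a measurement of the actual org.apache.fop.fo.flow.Block. */
public class NodeCostEstimate {
    public static void main(String[] args) {
        int header = 16;        // assumed object header on a 64-bit JVM
        int refSize = 4;        // assumed compressed reference size
        int propertySlots = 40; // assumed property references per Block

        int blockSize = header + propertySlots * refSize;
        System.out.println("per-Block FO-tree cost : ~" + blockSize + " bytes");
        // Every node that stays alive into layout roughly doubles the bill:
        // one LayoutManager plus assorted layout-related objects per node.
        System.out.println("with layout objects    : ~" + (2 * blockSize) + " bytes");
    }
}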

>> RowPainter is at the area-generation stage. My solution is just making
>> sure areas are generated with the information that is available anyway.
>> It's not doing anything in a place where it shouldn't do it.
>>
>> Ultimately, that is the only place where this should have effects, I
>> think. Seems like we don't really need dummy nodes for anything
>> layout-related, so why precisely should they exist at that point?
>> Unless a darn' *good* answer to this question is given (i.e. a point
>> other than a rigid adherence to the 'no-duplication' rule)

> But this is exactly that. A rigid, strict, blind adherence to the
> ‘no-duplication’ rule. Because code duplication almost always is a sign
> that something is wrong in the architecture.
> Because something that looks like just a little redundancy generally
> reveals a deeper problem that grows with time and eventually may
> totally prevent the implementation of new features.

Sure. I very much agree that code duplication needs to be avoided. The architecture of FOP, OTOH, is precisely not to be viewed as one single picture. Much effort has been made in the past to draw the distinction between the parsing/layout/rendering modules more clearly, exactly to make implementing new features easier. If a feature can be implemented by modifying only one of those modules, then why spread the modifications over the other(s)?

If architecture is the point, then making the adjustment only in the layout package, if possible, is preferable. We should, as much as possible, avoid making changes in the FO tree that the layout engine then comes to depend on too much. IIRC, it was always the intention to get to a point where they are reasonably independent components: one to build the FO tree, one to perform the layout for a given FO subtree, one to render the resulting area tree, one to manage the events produced by the others... This is precisely to facilitate the implementation of new features, different types of renderer, etc.

Ultimately, as I recall, it was always the long-term intention that the layout package would contain no hard references to FOP's own fo package (apart from classes like LayoutManagerMapping, maybe), but that the FO tree would be exposed as a set of interfaces (possibly to be made standard among FO processors; I recall Victor Mote trying to raise a discussion about this quite some time ago). One FO tree implementor may choose to use SAX events to build an FO tree structure (as FOP's does), another might choose to build an FO tree backed by a proprietary lazy DOM (or a collection thereof?). Yet another may choose an entirely different approach, perhaps only remotely related to any known XML API. As long as it could be mapped to the standard interfaces, it wouldn't matter /where/ the FOs come from exactly; the layout engine would ideally always be able to work with them. Not that something like that seems very close ATM...
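
Purely illustrative (no such interfaces exist today; all names below are invented), the direction would be something like:

/** A minimal read-only view of an FO node that a layout engine could
 *  depend on, instead of depending on the concrete fo.* classes. */
interface FoNodeView {
    String getName();                    // e.g. "fo:block"
    String getProperty(String name);     // resolved property value
    Iterable<? extends FoNodeView> getChildren();
}

/** The layout side would code against the interface only, so that any
 *  implementor (SAX-built, lazy DOM, something else entirely) plugs in. */
final class InterfaceOnlyLayout {
    void layout(FoNodeView node) {
        // ... create the appropriate LM, resolve breaks, add areas ...
        for (FoNodeView child : node.getChildren()) {
            layout(child);
        }
    }
}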

<snip />

>> What is the more precise description of this 'additional maintenance'
>> in this case?
>> Is it really /that/ much that it justifies the generation of additional
>> relatively large and long-lived objects so early in the process?
>> (Objects that basically correspond to nothing at all; maybe due to code
>> that was added just to make the code look cleaner... ;-))
>
> What do you mean?

What it says: adding some code that generates unnecessary overhead, for the sake of the architecture only, is virtually always a bad idea, unless there are other compelling reasons to do so.

<snip />

>> Maybe, if we leave it but take note for now, a better solution will
>> become apparent once the refactoring is done (?)
>
> Maybe, but usually the discussion is carried out prior to the commit...
> And in that case proper TODOs are placed at the proper places in the
> code. And possibly a bug report is opened.

Point. I'm definitely not saying the response was unwarranted.

>> As such, there are always going to be 'unfinished' pieces of code, or
>> parts that can be improved upon. The base rule remains: if it works,
>> and it didn't work before, the balance is positive. If we take that
>> simple rule away, it will in the long run be counter-productive, I
>> think.
>
> I have to strongly disagree with that. If it works, and it didn’t work
> before, /and/ the code is kept as clean, readable and maintainable as
> before, /then/ the balance is positive.

That's what review is there for. To make sure that the balance does not tip over to the other side.

'clean, readable and maintainable' are far from absolutes. What works for A might not work for B. You're very free to point that out. I'm very free to do with it whatever I wish (which obviously depends somewhat on the tenor of the reaction... and my mood ;-)).

> Look: alright, the implementation of missing cells is not optimal, but
> it’s no big deal. Then we have to fix the current shortcomings of
> border resolution. Another slightly sub-optimal change, but again no
> big deal. Then we implement this functionality. And then that feature.
> Every time a small, maybe-not-optimal-change-but-no-big-deal. And we
> end up with a completely unmaintainable code base, where the only way
> to implement further features is to rewrite it from scratch. But
> I believe I’m repeating myself.

Heh, don't we all from time to time...?

My point is much the same: choosing an approach with a negligible (but still additional) overhead once, by itself, is not much. But if we were to make it common practice, the code would become stuffed with a lot of designed-in "negligible" overhead that together could mount up to a significant overhead, which further on becomes very difficult to optimize away, and would then warrant a rewrite/major refactoring one way or the other.


Cheers

Andreas
