Hi Johan,
I didn't look in detail yet (and checked with the code), but your
analysis makes sense. I believe the root problem, though, is lack of
documentation/specs. Our knowledge is mainly based on reverse
engineering and reading the code. That allowed me to isolate and
describe the "problem" I described in my original post, but even that
is far from complete, as the layout process spans more classes and
phases. I typically use pen and paper to draw the flow for isolating
issues like this, but those pictures are never complete.
The second thing is, testing. Ideally, that should be related to
specs, but in absence of specs we can test the current situation, and
mark that as the lower bar. However, that implies that tons of tests
would need to be added. And even then, there might be fixes that in
general would lead to less pulses, but in corner cases might have a
slight performance penalty.
In summary, I believe the specification part is the most important
one. That will enable rigorous testing. With the headless platform in
place now, those tests do not even require a real graphics pipeline.
- Johan
On Thu, Oct 23, 2025 at 2:49 AM John Hendrikx
<[email protected]> wrote:
Johan, I had a closer look.
TLDR:
- FX currently almost always does two layout passes, with the 2nd
pass not changing anything, due to a flaw in Node's relayout
trigger logic
- I think it may be possible to fix this, and as a side effect
also fixes your infinite layout loop scenario
Longer version:
Although I think the original fix is correct (as it fixes broken
state tracking), I do see that the fix may cause existing flaws,
that were swept under the rug before, to become more problematic.
I think the fundamental reason for the relayout trigger code in
Node on LayoutX/Y changes is to catch cases when these are
modified OUTSIDE the scope of its direct Parent's layoutChildren
call. In fact, it seemingly does its best to detect when a change
occurs as part of a Parent's layoutChildren call as seen in this
code and the clarifying comment:
// Propagate layout if this change isn't triggered by its parent
if(p != null&& !p.isCurrentLayoutChild(Node.this)) {
However, this check utterly fails in almost all situations, for
the simple reason that the child node which has its layout X/Y
changed is NOT considered the current layout child by its parent.
None of the containers will update the current layout child before
doing each layoutX/Y change (as that would be very cumbersome...)
However, we're not actually interested in that. All we want to
know is:
*** Is this layout X/Y change performed while the direct
Parent's layoutChildren() call is higher up in the call stack? ***
There are two ways to answer that question:
- One can ask the *grandparent* if the child's parent is the
current layout child -> the answer would be yes, BUT it only
means its layout() is running, and often also its layoutChildren()
(so this isn't perfect)
- Another option is to ask the direct parent if it is currently
performing layout -> the answer would be yes, and it normally
means its layoutChildren is running, but the flag isn't cleared
immediately after the call completes, so this is also a bit
flawed; excerpt from Parent::layout():
performingLayout= true;
layoutChildren(); ... (more code that triggers layouts for the
rest of the sub tree) ...
performingLayout= false;
The flag however is incredibly close to what we need, and an extra
tighter flag that is reset immediately after `layoutChildren`
completes (instead of after the whole subtree completes layout)
would be exactly what we need.
So now if Node wants to know if a layout X/Y is changed because
the parent's layout children triggered it, then all it needs to
ask the parent node is if this new flag is set.
With this fix, your scenario would also no longer trigger a layout
loop, assuming the node you are modifying is a direct child of the
node on which layoutChildren is overridden.
The only reason that currently this broken logic is not already
triggering infinite layouts is that a 2nd pass will set all values
to the same values, and Node will not trigger its logic if the
value was unchanged. However, its already broken that a 2nd pass
is necessary at all. In effect, JavaFX is currently almost always
doing **two** layout passes; one that changes all the positions
and sizes, and another that "confirms" these unnecessarily.
As it is always dangerous to play with the layout logic, this will
need to be well tested, but I think we really need to fix this, or
revert the JDK-8360940 change as I think it may otherwise cause
too much problems, even though it is fixing problems and just
exposing new ones...
--John
On 21/10/2025 21:42, Johan Vos wrote:
I recently noticed a performance degradation in a JavaFX
application, that I could trace to the fix for JDK-8360940 [1]. I
believe the fix is correct, but it can cause infinite pulses in
some cases. It is somehow discussed in the second of the 2 errors
in Parent, in [2], but I believe this pattern is used often, and
will now lead to infinite pulses
For example, a class extending Parent and implementing
layoutChildren that does something like this:
@Override
protected void layoutChildren() {
super.layoutChildren();
if (someSpecificCondition) {
someManagedNode.relocate(someX, someY);
}
}
Support the someManagedNode is at (x0, y) before the pulse runs.
The call to super.layoutChildren() will set the layoutX/Y to e.g.
superX, superY and it will request a new pulse (in case superX !=
x0 or superY != y).
Next, the someManagedNode is conditionally repositioned to someX,
someY .
In the next pulse (immediately following this pulse), the
super.layoutChildren will again set the layoutX/Y to superX,
superY, and trigger a new pulse, no matter what happens
afterwards. The conditional code will again reset the layoutX/Y
to someX, someY (which is what they had before this pulse started
layoutChildren()), but the next pulse is scheduled AND it will
visit this Node.
Even though in the end the layoutX/layoutY properties of
someManagedNode are not changed as the result of layoutChildren,
a new pulse will be requested because there was an intermediate
value (which never got rendered, as it was changed again even
before the layoutChildren returns) that triggered the
requestNextPulse.
I think this pattern is used in a number of external controls. It
worked "by accident" until the issue in JDK-8360940 was fixed. In
defense of the pattern above, one could argue that a requestPulse
should only be triggered if the triggering property was altered
at the end of the layoutPhase (that is, if there is a difference
in value between the start and the end of the layoutPhase).
In summary, I don't propose to change anything, and I agree with
the commit in [1], but it might need some attention in the
release notes or docs.
- Johan
[1]
https://github.com/openjdk/jfx/commit/5682424bebae7947f665fcb11429c3d2e069e8e5
[2] https://bugs.openjdk.org/browse/JDK-8360940?focusedId=14811022